Abstract. A fully-automated algorithm is developed able to show that evaluation of a
given untyped λ-expression will terminate under CBV (call-by-value). The "size-change
principle" from first-order programs is extended to arbitrary untyped λ-expressions in
two steps. The first step suffices to show CBV termination of a single, stand-alone
λ-expression. The second suffices to show CBV termination of any member of a regular set
of λ-expressions, defined by a tree grammar. (A simple example is a minimum function,
when applied to arbitrary Church numerals.) The algorithm is sound and proven so in
this paper. The Halting Problem's undecidability implies that any sound algorithm is
necessarily incomplete: some λ-expressions may in fact terminate under CBV evaluation,
but not be recognised as terminating.

The intensional power of the termination algorithm is reasonably high. It certifies as
terminating many interesting and useful general recursive algorithms including programs with
mutual recursion and parameter exchanges, and Colson's "minimum" algorithm. Further,
our type-free approach allows use of the Y combinator, and so can identify as terminating
a substantial subset of PCF.
1. Introduction

The size-change analysis by Lee, Jones and Ben-Amram [14] can show termination of
programs whose parameter values have a well-founded size order. The method is reasonably
general, easily automated, and does not require human invention of lexical or other
parameter orders. It applies to first-order functional programs. This paper applies similar
ideas to termination of higher-order programs. For simplicity and generality we focus on
the simplest such language, the λ-calculus.

Contribution of this paper. Article [12] (prepared for an invited conference lecture)
showed how to lift the methods of [14] to show termination of closed λ-expressions. The
current paper is a journal version of [12]. It extends [12] to deal not only with a single
λ-expression in isolation, but with a regular set of λ-expressions generated by a finite
tree grammar. For example, we can show that a λ-expression terminates when applied to
Church numerals, even though it may fail to terminate on all possible arguments. This
paper includes a number of examples showing its analytical power, including programs with
primitive recursion, mutual recursion and parameter exchanges, and Colson's "minimum"
algorithm. Further, examples show that our type-free approach allows free use of the Y
combinator, and so can identify as terminating a substantial subset of PCF.

1.1. Related work. Jones [11] was an early paper on control-flow analysis of the untyped
λ-calculus. Shivers' thesis and subsequent work [221 US] on CFA (control flow analysis)
developed this approach considerably further and applied it to the Scheme programming
language. This line is closely related to the approximate semantics (static control graph) of
Section 3.6 [11].

Termination of untyped programs. Papers based on [13] have used size-change graphs to find
bounds on program running times (Frederiksen and Jones [5]); solved related problems, e.g.,
to ensure that partial evaluation will terminate (Glenstrup and Jones, Lee [10, 15]); and
found more efficient (though less precise) algorithms (Lee [E]). Further, Lee's thesis [17]
extends the first-order size-change method [14] to handle higher-order named combinator
programs. It uses a different approach than ours, and appears to be less general.

We had anticipated from the start that our framework could naturally be extended to 
higher-order functional programs, e.g., functional subsets of Scheme or ML. This has since 
been confirmed by Sereni and Jones, first reported in [19]. Sereni's Ph.D. thesis [21] develops 
this direction in considerably more detail with full proofs, and also investigates problems 
with lazy (call-by-name) languages. Independently and a bit later, Giesl and coauthors
have addressed the analysis of the lazy functional language Haskell [8].

Termination of typed λ-calculi. Quite a few people have written about termination based on
types. Various subsets of the λ-calculus, in particular subsets typable by various disciplines,
have been proven strongly normalising. Work in this direction includes pathbreaking results
by Tait [21] and others concerning simple types, and Girard's System F [9]. Abel, Barthe
and others have developed newer type-based approaches to show termination of a λ-calculus
extended with recursive data types [HOIS].

Typed functional languages: Xi's Ph.D. research focused on tracing value flow via data
types for termination verification in higher-order programming languages [28]. Wahlstedt
has an approach to combine size-change termination analysis with constructive type theory
[26, 27].

Term rewriting systems: The popular "dependency pair" method was developed by
Arts and Giesl ^ for first-order programs in TRS form. This community has begun to
study termination of higher-order term rewriting systems, including research by Giesl et al.
[3, 8], Toyama [25] and others.

2. The call-by-value λ-calculus

First, we review relevant definitions and results for the call-by-value λ-calculus, and
then provide an observable characterisation of the behavior of a nonterminating expression.

2.1. Classical semantics. 

Definition 2.1. Exp is the set of all λ-expressions that can be formed by these syntax
rules, where @ is the application operator (sometimes omitted). We use the teletype font
for λ-expressions.

$$e, P ::= x \mid e @ e \mid \lambda x.e$$

$$x ::= \text{Variable name}$$

• The set of free variables $fv(e)$ is defined as usual: $fv(x) = \{x\}$, $fv(e @ e') = fv(e) \cup fv(e')$
and $fv(\lambda x.e) = fv(e) \setminus \{x\}$. A closed λ-expression e satisfies $fv(e) = \emptyset$.

• A program, usually denoted by P, is any closed λ-expression.

• The set of subexpressions of a λ-expression e is denoted by $subexp(e)$.
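The free-variable and subexpression functions translate directly into code. The following is a minimal Python sketch, not taken from the paper: the tuple encoding of λ-expressions and the names `fv`, `subexp` and `is_program` are assumptions made for illustration.

```python
# Assumed encoding of lambda-expressions as nested tuples:
#   ("var", x) | ("lam", x, body) | ("app", e1, e2)

def fv(e):
    """Free variables of e, following the three equations of Definition 2.1."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "app":
        return fv(e[1]) | fv(e[2])
    if tag == "lam":
        return fv(e[2]) - {e[1]}
    raise ValueError(f"not a lambda-expression: {e!r}")


def subexp(e):
    """Set of all subexpressions of e, including e itself."""
    tag = e[0]
    if tag == "var":
        return {e}
    if tag == "app":
        return {e} | subexp(e[1]) | subexp(e[2])
    if tag == "lam":
        return {e} | subexp(e[2])
    raise ValueError(f"not a lambda-expression: {e!r}")


def is_program(e):
    """A program is a closed lambda-expression."""
    return not fv(e)


x, y = ("var", "x"), ("var", "y")
omega = ("app", ("lam", "x", ("app", x, x)), ("lam", "y", ("app", y, y)))
open_term = ("lam", "x", ("app", x, y))   # y occurs free
```

Because `subexp` returns a set, syntactically identical occurrences (such as the two occurrences of x in x@x) are counted once.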

The following is standard, e.g., [18]. Notation: β-reduction is done by substituting v
for all free occurrences of x in e, written e[v/x], and renaming λ-bound variables if needed
to avoid capture.

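Definition 2.2. (Call-by-value semantics) The call-by-value evaluation relation is defined
by the following inference rules, with judgement form $e \Downarrow v$ where e is a closed λ-expression
and $v \in ValueS$. ValueS (for "standard value") is the set of all abstractions $\lambda x.e$.

$$\frac{}{v \Downarrow v}\ \text{(ValueS), if } v \in ValueS \qquad\qquad \frac{e_1 \Downarrow \lambda x.e_0 \quad e_2 \Downarrow v_2 \quad e_0[v_2/x] \Downarrow v}{e_1 @ e_2 \Downarrow v}$$

Lemma 2.3. (Determinism) If $e \Downarrow v$ and $e \Downarrow w$ then $v = w$.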
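The two rules of Definition 2.2 can be read as a recursive evaluator. The sketch below is illustrative only: the tuple encoding and the names `subst` and `eval_cbv` are assumptions, not the paper's notation. It relies on the fact that, when a closed program is evaluated by value, every substituted value is itself closed, so no renaming is needed to avoid capture.

```python
# Assumed encoding: ("var", x) | ("lam", x, body) | ("app", e1, e2)

def subst(e, x, v):
    """e[v/x] for a CLOSED value v; since v has no free variables, no capture can occur."""
    tag = e[0]
    if tag == "var":
        return v if e[1] == x else e
    if tag == "app":
        return ("app", subst(e[1], x, v), subst(e[2], x, v))
    if e[1] == x:            # lambda rebinding x: no free occurrences of x below
        return e
    return ("lam", e[1], subst(e[2], x, v))


def eval_cbv(e):
    """Return v with e evaluating to v; recursion is unbounded on nonterminating input."""
    if e[0] == "lam":        # abstractions are values and evaluate to themselves
        return e
    if e[0] == "app":        # evaluate operator, then operand, then the instantiated body
        f = eval_cbv(e[1])
        v2 = eval_cbv(e[2])
        return eval_cbv(subst(f[2], f[1], v2))
    raise ValueError("free variable encountered: expression is not closed")


I = ("lam", "z", ("var", "z"))
K = ("lam", "a", ("lam", "b", ("var", "a")))
```

Applied to a nonterminating program such as (λx.x@x)@(λy.y@y), this evaluator simply exhausts Python's recursion limit, which is one practical reason for the "calls" relation introduced next.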

2.2. Nontermination is sequential. A proof of $e \Downarrow v$ is a finite object, and no such proof
exists if the evaluation of e fails to terminate. Thus, in order to be able to trace an arbitrary
computation, terminating or not, we introduce a new "calls" relation $e \to e'$ that
makes nontermination visible.
The "calls" relation. The rationale is straightforward: $e \to e'$ if in order to deduce $e \Downarrow v$
for some value v, it is necessary first to deduce $e' \Downarrow u$ for some u, i.e., some inference
rule has form

$$\frac{\ldots \quad e' \Downarrow\ ? \quad \ldots}{e \Downarrow\ ?}$$

Applying this to Definition 2.2 gives the following.

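Definition 2.4. (Evaluation and call semantics) The evaluation and call relations are
defined by the following inference rules, where $\to_r, \to_d, \to_c\ \subseteq Exp \times Exp$.

$$\frac{}{v \Downarrow v}\ \text{(Value), if } v \in ValueS \qquad \frac{}{e_1@e_2 \to_r e_1}\ \text{(Operator)} \qquad \frac{e_1 \Downarrow v_1}{e_1@e_2 \to_d e_2}\ \text{(Operand)}$$

$$\frac{e_1 \Downarrow \lambda x.e_0 \quad e_2 \Downarrow v_2}{e_1@e_2 \to_c e_0[v_2/x]}\ \text{(Call$_0$)} \qquad \frac{e_1 \Downarrow \lambda x.e_0 \quad e_2 \Downarrow v_2 \quad e_0[v_2/x] \Downarrow v}{e_1@e_2 \Downarrow v}\ \text{(Apply$_0$)}$$

For convenience we will sometimes combine the three into a single call relation
$\to\ =\ \to_r \cup \to_d \cup \to_c$. As usual, we write $\to^+$ for the transitive closure of $\to$ and $\to^*$ for its reflexive
transitive closure. We will sometimes write $s \Downarrow$ to mean $s \Downarrow v$ for some $v \in ValueS$, and
write $s \not\Downarrow$ to mean there is no $v \in ValueS$ such that $s \Downarrow v$, i.e., if evaluation of s does not
terminate.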



A small improvement to the operational semantics. Note that rules (Call$_0$) and (Apply$_0$)
from Definition 2.4 overlap: $e_2 \Downarrow v_2$ appears in both, as does $e_0[v_2/x]$. Thus (Call$_0$) can be
used as an intermediate step to simplify (Apply$_0$), giving a more orthogonal set of rules.
Variations on the following combined set will be used in the rest of the paper:

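Definition 2.5. (Combined evaluate and call rules, standard semantics)

$$\frac{}{v \Downarrow v}\ \text{(Value), if } v \in ValueS \qquad \frac{}{e_1@e_2 \to_r e_1}\ \text{(Operator)} \qquad \frac{e_1 \Downarrow v_1}{e_1@e_2 \to_d e_2}\ \text{(Operand)}$$

$$\frac{e_1 \Downarrow \lambda x.e_0 \quad e_2 \Downarrow v_2}{e_1@e_2 \to_c e_0[v_2/x]}\ \text{(Call)} \qquad \frac{e_1@e_2 \to_c e' \quad e' \Downarrow v}{e_1@e_2 \Downarrow v}\ \text{(Apply)}$$

The call tree of program P is the smallest set of expressions CT containing P that is closed
under $\to\ =\ \to_r \cup \to_d \cup \to_c$. It is not necessarily finite.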

Lemma 2.6. (NIS, or Nontermination Is Sequential) Let P be a program. Then $P \Downarrow$ if and
only if CT has no infinite call chain starting with P:

$$P = e_0 \to e_1 \to e_2 \to \ldots$$

¹Naming: r, d in $\to_r$, $\to_d$ are the last letters of operator and operand, and c in $\to_c$ stands for "call".
Example: evaluation of expression Ω = (λx.x@x)@(λy.y@y) yields an infinite call chain:

$$\Omega = (\lambda x.x@x)@(\lambda y.y@y) \to (\lambda y.y@y)@(\lambda y.y@y) \to (\lambda y.y@y)@(\lambda y.y@y) \to \ldots$$

By the NIS Lemma all nonterminating computations give rise to infinite linear call chains.
Such call chains need not, however, be repetitive as in this example, or even range over
finitely many distinct expressions.

Informally, $e_0 \not\Downarrow$ implies existence of an infinite call chain as follows: Try to build,
bottom-up and left-to-right, a proof tree for $e_0 \Downarrow v$. Since call-by-value evaluation cannot
"get stuck" this process will continue infinitely, leading to an infinite call chain. Figure 1
shows such a call tree with infinite path starting with $e_0 \to e_1 \to e_2 \to e_3 \to \ldots$, where
$\to\ =\ \to_r \cup \to_d \cup \to_c$. The Appendix contains a formal proof.

[Figure 1: Nontermination implies existence of an infinite call chain. The figure shows a partial proof tree for $e_0 \Downarrow\ ?$. Code: r = "Operator", d = "Operand", c = "call".]



3. An approach to termination analysis 

The "size-change termination" analysis of Lee, Jones and Ben-Amram [14] is based on
several concepts, including:

(1) Identifying nontermination as caused by infinitely long sequential state transitions. 

(2) A fixed set of program control points. 

(3) Observable decreases in data value sizes. 

(4) Construction of one size-change graph for each function call. 

(5) Finding the program's control flow graph, and the call sequences that follow it. 
The NIS Lemma establishes point (1). However, concepts (2), (3), (4) and (5) all seem a priori
absent from the λ-calculus, except that an application must be a call; and even then, it is
not a priori clear which function is being called. We will show, one step at a time, that all
the concepts do in fact exist in call-by-value λ-calculus evaluation.

3.1. An environment-based semantics. Program flow analysis usually requires evident
program control points. An alternate environment-based formulation remedies their absence
in the λ-calculus. The ideas were formalised by Plotkin [18], and have long been used in
implementations of functional programming languages such as Scheme and ML.

Definition 3.1. (States, etc.) Define State, Value, Env to be the smallest sets such that

$$State = \{\, e : \rho \mid e \in Exp,\ \rho \in Env \text{ and } dom(\rho) \supseteq fv(e) \,\}$$

$$Value = \{\, \lambda x.e : \rho \mid \lambda x.e : \rho \in State \,\}$$

$$Env = \{\, \rho : X \to Value \mid X \text{ is a finite set of variables} \,\}$$

Equality of states is defined by:

$$e_1 : \rho_1 = e_2 : \rho_2 \ \text{ holds if } e_1 = e_2 \text{ and } \rho_1(x) = \rho_2(x) \text{ for all } x \in fv(e_1)$$

The empty environment with domain $X = \emptyset$ is written []. The environment-based evaluation
judgement form is $s \Downarrow v$ where $s \in State$, $v \in Value$.

The Plotkin-style rules follow the pattern of Definition 2.2 except that substitution
(β-reduction) $e_0[v_2/x]$ of the (CallS) rule is replaced by a "lazy substitution" that just updates
the environment in the new (Call) rule. Further, variable values are fetched from the
environment.

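Definition 3.2. (Environment-based evaluation semantics) The evaluation relation $\Downarrow$ is
defined by the following inference rules.

$$\frac{}{v \Downarrow v}\ \text{(ValueE), if } v \in Value \qquad\qquad \frac{}{x : \rho \Downarrow \rho(x)}\ \text{(VarE)}$$

$$\frac{e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0 \quad e_2 : \rho \Downarrow v_2 \quad e_0 : \rho_0[x \mapsto v_2] \Downarrow v}{e_1@e_2 : \rho \Downarrow v}\ \text{(ApplyE$_0$)}$$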
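The three environment-based rules correspond to a short closure-based interpreter. The following Python sketch is an illustration under stated assumptions: λ-expressions are encoded as tuples, an environment is a Python dict from variable names to closures, and the name `eval_env` is hypothetical.

```python
# Assumed encoding: ("var", x) | ("lam", x, body) | ("app", e1, e2).
# A state e : rho is the pair (e, rho), where rho is a dict mapping variable names to values.

def eval_env(e, rho):
    """Return the value (lam, rho') such that e : rho evaluates to lam : rho'."""
    tag = e[0]
    if tag == "lam":                      # (ValueE): a closure evaluates to itself
        return (e, rho)
    if tag == "var":                      # (VarE): fetch the value from the environment
        return rho[e[1]]
    lam, rho0 = eval_env(e[1], rho)       # operator evaluates to a closure
    v2 = eval_env(e[2], rho)              # operand evaluates to v2
    # "Lazy substitution": extend the closure's environment instead of rewriting the body.
    return eval_env(lam[2], {**rho0, lam[1]: v2})


I = ("lam", "z", ("var", "z"))
K = ("lam", "a", ("lam", "b", ("var", "a")))
result = eval_env(("app", ("app", K, I), K), {})
partial = eval_env(("app", K, I), {})
```

Note that the expression component of each result is a piece of the original program text; no new λ-expressions are constructed during evaluation.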



3.2. States are tree structures. A state has form $s = e : \rho$ as in Definition 3.1, where
ρ binds the free variables of e to values, which are themselves states. Consider, for
example, these two states

$$s = e : \rho = r@(r@a) : [r \mapsto succ : [],\ a \mapsto 2 : []]$$

$$s' = e' : \rho' = r@(r@a) : [r \mapsto \lambda a.\,r@(r@a) : [r \mapsto succ : []],\ a \mapsto 2 : []]$$

(written in our usual linear notation and using the standard Church numerals 0, 1, 2, ...).
For brevity details of the successor function succ are omitted. It is straightforward to verify
that $s \Downarrow 4$ and $s' \Downarrow 6$ by Definition 3.2.

More generally, each value bound in an environment is a state in turn, so in full detail
a state's structure is a finite tree. (The levels of this tree represent variable bindings, not
to be confused with the syntactic or subexpression tree structures from Figure 3.)
[Figure 2: Structures of two states s, s'. Each state is a finite tree. State s has root r@(r@a) : [•] with children succ : [] and 2 : []. State s' has root r@(r@a) : [•] with children λa.r@(r@a) : [•] (which in turn has child succ : []) and 2 : [].]

Figure 2 shows the structure of these two states, with abbreviations for Church numerals
such as 2 = λs.λz.s@(s@z).



3.3. Nontermination made visible in an environment-based semantics. Straightforwardly
adapting the approach of Section 2.2 gives the following set of inference rules,
variations on which will be used in the rest of the paper:

Definition 3.3. (Combined evaluate and call rules, environment semantics)

$$\frac{}{v \Downarrow v}\ \text{(Value), if } v \in Value \qquad\qquad \frac{}{x : \rho \Downarrow \rho(x)}\ \text{(Var)}$$

$$\frac{}{e_1@e_2 : \rho \to_r e_1 : \rho}\ \text{(Operator)} \qquad\qquad \frac{e_1 : \rho \Downarrow v_1}{e_1@e_2 : \rho \to_d e_2 : \rho}\ \text{(Operand)}$$

$$\frac{e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0 \quad e_2 : \rho \Downarrow v_2}{e_1@e_2 : \rho \to_c e_0 : \rho_0[x \mapsto v_2]}\ \text{(Call)} \qquad \frac{e_1@e_2 : \rho \to_c e' : \rho' \quad e' : \rho' \Downarrow v}{e_1@e_2 : \rho \Downarrow v}\ \text{(Apply)}$$

The following is proven in the same way as Lemma 2.6.

Lemma 3.4. (NIS, or Nontermination Is Sequential) Let P be a program. Then $P : [] \Downarrow$ if
and only if CT has no infinite call chain starting with $P : []$ (where $\to\ =\ \to_r \cup \to_d \cup \to_c$):

$$P : [] = e_0 : \rho_0 \to e_1 : \rho_1 \to e_2 : \rho_2 \to \ldots$$

Following the lines of Plotkin [18], the environment-based semantics is shown equivalent
to the usual semantics in the sense that they have the same termination behaviour. Further,
when evaluation terminates the computed values are related by function $F : State \to Exp$
defined by

$$F(e : \rho) = e[F(\rho(x_1))/x_1, \ldots, F(\rho(x_k))/x_k] \ \text{ where } \{x_1, \ldots, x_k\} = fv(e).$$

Lemma 3.5. $P : [] \Downarrow v$ (by Definition 3.2) implies $P \Downarrow F(v)$ (by Definition 2.5), and
$P \Downarrow w$ implies there exists $v'$ such that $P : [] \Downarrow v'$ and $F(v') = w$.
Example: evaluation of closed Ω = (λx.x@x)@(λy.y@y) yields an infinite call chain:

$$\Omega : [] = (\lambda x.x@x)@(\lambda y.y@y) : [] \to x@x : \rho_1 \to y@y : \rho_2 \to y@y : \rho_2 \to y@y : \rho_2 \to \ldots$$

where $\rho_1 = [x \mapsto \lambda y.y@y : []]$ and $\rho_2 = [y \mapsto \lambda y.y@y : []]$.
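The calls made during an evaluation can be observed even when the evaluation never finishes, by producing them lazily. The Python sketch below is illustrative only: the tuple encoding, the dict environments and the generator `trace` are assumptions. It enumerates calls in depth-first order, labelled r, d or c as in Definition 3.3; the nested, still-active calls at any moment form a call chain.

```python
from itertools import islice

# Assumed encoding: ("var", x) | ("lam", x, body) | ("app", e1, e2); environments are dicts.

def trace(e, rho):
    """Yield (label, e', rho') for each call made while evaluating e : rho.

    The generator's return value is the closure that e : rho evaluates to.
    """
    tag = e[0]
    if tag == "lam":                        # (Value): no call is made
        return (e, rho)
    if tag == "var":                        # (Var): look the value up
        return rho[e[1]]
    yield ("r", e[1], rho)                  # (Operator)
    lam, rho0 = yield from trace(e[1], rho)
    yield ("d", e[2], rho)                  # (Operand)
    v2 = yield from trace(e[2], rho)
    body, rho1 = lam[2], {**rho0, lam[1]: v2}
    yield ("c", body, rho1)                 # (Call)
    return (yield from trace(body, rho1))   # (Apply)


x, y = ("var", "x"), ("var", "y")
xx, yy = ("app", x, x), ("app", y, y)
lam_y = ("lam", "y", yy)
omega = ("app", ("lam", "x", xx), lam_y)

# Only the first nine calls are requested, so the infinite computation is never completed.
first_calls = list(islice(trace(omega, {}), 9))
c_calls = [(e, rho) for (label, e, rho) in first_calls if label == "c"]
rho1 = {"x": (lam_y, {})}
rho2 = {"y": (lam_y, {})}
```

For Ω, the c-labelled calls are exactly the states x@x : ρ1, y@y : ρ2, y@y : ρ2 of the chain above; the interleaved r and d calls are the terminating operator and operand evaluations.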

3.4. A control point is a subexpression of a λ-expression. The following subexpression
property does not hold for the classical rewriting λ-calculus semantics, but does hold for
the Plotkin-style environment semantics of Definition 3.2. It is central to our program analysis:
A control point will be a subexpression of the program P being analysed, and our analyses
will trace program information flow to and from subexpressions of P.

Lemma 3.6. If $P : [] \Downarrow \lambda x.e : \rho$ then $\lambda x.e \in subexp(P)$. [Recall Definition 2.1.]

This is proven as follows, using a more general inductive hypothesis.

Definition 3.7. The expression support of a given state s is exp_sup(s), defined by

$$exp\_sup(e : \rho) = subexp(e) \cup \bigcup_{x \in fv(e)} exp\_sup(\rho(x))$$

Lemma 3.8. (Subexpression property) If $s \Downarrow s'$ or $s \to s'$ then $exp\_sup(s) \supseteq exp\_sup(s')$.

Proof. This follows by induction on the proof of $s \Downarrow s'$ or $s \to s'$. Lemma 3.6 is an immediate
corollary.

Base cases: $s = x : \rho$ and $s = \lambda x.e : \rho$ are immediate. For rule (Call) suppose
$e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0$ and $e_2 : \rho \Downarrow v_2$. By induction

$$exp\_sup(e_1 : \rho) \supseteq exp\_sup(\lambda x.e_0 : \rho_0) \ \text{ and } \ exp\_sup(e_2 : \rho) \supseteq exp\_sup(v_2)$$

Thus

$$exp\_sup(e_1@e_2 : \rho) \supseteq exp\_sup(e_1 : \rho) \cup exp\_sup(e_2 : \rho) \supseteq exp\_sup(\lambda x.e_0 : \rho_0) \cup exp\_sup(v_2) \supseteq exp\_sup(e_0 : \rho_0[x \mapsto v_2])$$

For rule (Apply) we have $exp\_sup(e_1@e_2 : \rho) \supseteq exp\_sup(e' : \rho') \supseteq exp\_sup(v)$. The cases
(Operator), (Operand) are immediate. □

3.5. Finitely describing a program's computation space. A standard approach to
program analysis is to trace data flow along the arcs of the program's dynamic control
graph or DCG. In our case this is the call relation → of Definition 2.5. Unfortunately
the DCG may be infinite, so for program analysis we will instead compute a safe finite
approximation called the SCG, for static control graph.

Example 3.9. Figure 3 shows the combinator Ω = (λx.x@x)@(λy.y@y) as a syntax tree
whose subexpressions are labeled by numbers. To its right is the "calls" relation →. It has
an infinite call chain:

$$\Omega : [] \to x@x : \rho_1 \to y@y : \rho_2 \to y@y : \rho_2 \to y@y : \rho_2 \to \ldots$$

Using subexpression numbers, the loop is

$$1 : [] \to 3 : \rho_1 \to 7 : \rho_2 \to 7 : \rho_2 \to \ldots$$

where $\rho_1 = [x \mapsto \lambda y.y@y : []]$ and $\rho_2 = [y \mapsto \lambda y.y@y : []]$. The set of states reachable from
P : [] is finite, so this computation is in fact a "repetitive loop." (It is also possible that a
computation will reach infinitely many states that are all different.)
[Figure 3: The DCG or dynamic control graph of a λ-expression. Left: the syntax tree of the λ-expression Ω, with numbered subexpressions. Right: the "calls" relation, where $\rho_1 = [x \mapsto 6 : []]$ and $\rho_2 = [y \mapsto 6 : []]$.]



By the NIS Lemma 3.4, if $P \not\Downarrow$ then there exists an infinite call chain

$$P : [] = e_0 : \rho_0 \to e_1 : \rho_1 \to e_2 : \rho_2 \to \ldots$$

By Lemma 3.8, $e_i \in subexp(P)$ for each i. Our termination-detecting algorithm will focus
on the size relations between consecutive environments $\rho_i$ and $\rho_{i+1}$ in this chain. Since
subexp(P) is a finite set, at least one subexpression e occurs infinitely often, so "self-loops"
will be of particular interest.

Since all states have an expression component lying in a set of fixed size, and each
expression in the environment also lies in this finite set, in an infinite state set S there will
be states whose environment depths are arbitrarily large.



3.6. Static control flow graphs for λ-expressions. The end goal, given program P, is
implied by the NIS Lemma 3.4: correctly to assert the nonexistence of any infinite call chain
starting at P : []. By the Subexpression Lemma 3.8 an infinite call chain
$e_0 : \rho_0 \to e_1 : \rho_1 \to e_2 : \rho_2 \to \ldots$ can only contain finitely many different expression
components $e_i$. A static control flow graph (SCG for short) including all expression
components can be obtained by abstract interpretation of the "Calls" and "Evaluates-to"
relations (Cousot and Cousot [5). Figure 4 shows a SCG for Ω.

[Figure 4: The SCG or static control graph of a λ-expression. Left: the λ-expression. Right: its control flow graph.]

An approximating SCG may be obtained by removing all environment components
from Definition 3.3. To deal with the absence of environments the variable lookup rule is
modified: If $e_1@e_2$ is any application in P such that $e_1$ can evaluate to a value of form $\lambda x.e$
and $e_2$ can evaluate to value $v_2$, then $v_2$ is regarded as a possible value of x.

Although approximate, these rules have the virtue that there are only finitely many
possible judgements $e \to e'$ and $e \Downarrow e'$. Consequently, the runtime behavior of program P
may be (approximately) analysed by exhaustively applying these inference rules. A later
section will extend the rules so they also generate size-change graphs.

Definition 3.10. (Approximate evaluation and call rules) The new judgement forms are
$e \Downarrow e'$ and $e \to e'$. The inference rules are:

$$\frac{}{\lambda x.e \Downarrow \lambda x.e}\ \text{(ValueA)} \qquad\qquad \frac{e_1@e_2 \in subexp(P) \quad e_1 \Downarrow \lambda x.e_0 \quad e_2 \Downarrow v_2}{x \Downarrow v_2}\ \text{(VarA)}$$

$$\frac{}{e_1@e_2 \to_r e_1}\ \text{(OperatorA)} \qquad\qquad \frac{}{e_1@e_2 \to_d e_2}\ \text{(OperandA)}$$

$$\frac{e_1 \Downarrow \lambda x.e_0 \quad e_2 \Downarrow v_2}{e_1@e_2 \to_c e_0}\ \text{(CallA)} \qquad\qquad \frac{e_1@e_2 \to_c e' \quad e' \Downarrow v}{e_1@e_2 \Downarrow v}\ \text{(ApplyA)}$$

The (VarA) rule refers globally to P, the program being analysed. The approximate
evaluation is nondeterministic, since an expression may evaluate to more than one value.
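Because only finitely many judgements are possible, the approximate rules can be applied exhaustively until nothing new is derivable. The following Python sketch is one straightforward way to do this; it is not the paper's implementation. The tuple encoding and the name `analyse` are assumptions, operator and operand calls are recorded for every application without any premise (a conservative choice), and bound variables are assumed to have distinct names.

```python
# Assumed encoding: ("var", x) | ("lam", x, body) | ("app", e1, e2)

def subexp(e):
    """Set of all subexpressions of e, including e itself."""
    if e[0] == "var":
        return {e}
    if e[0] == "lam":
        return {e} | subexp(e[2])
    return {e} | subexp(e[1]) | subexp(e[2])


def analyse(P):
    """Least sets (evals, calls) closed under the approximate rules.

    evals holds pairs (e, v) read as "e may evaluate to v";
    calls holds triples (e, label, e') with label in {"r", "d", "c"}.
    """
    subs = subexp(P)
    apps = [e for e in subs if e[0] == "app"]
    evals = {(e, e) for e in subs if e[0] == "lam"}              # (ValueA)
    calls = set()
    for a in apps:
        calls.add((a, "r", a[1]))                                # (OperatorA)
        calls.add((a, "d", a[2]))                                # (OperandA)
    changed = True
    while changed:
        before = (len(evals), len(calls))
        for a in apps:
            ops = [v for (e, v) in evals if e == a[1]]
            args = [v for (e, v) in evals if e == a[2]]
            for lam in ops:
                if args:
                    calls.add((a, "c", lam[2]))                  # (CallA)
                for v2 in args:
                    evals.add((("var", lam[1]), v2))             # (VarA)
        for (a, label, target) in list(calls):
            if label == "c":
                for (e, v) in list(evals):
                    if e == target:
                        evals.add((a, v))                        # (ApplyA)
        changed = (len(evals), len(calls)) != before
    return evals, calls


x, y = ("var", "x"), ("var", "y")
xx, yy = ("app", x, x), ("app", y, y)
lam_y = ("lam", "y", yy)
omega = ("app", ("lam", "x", xx), lam_y)
evals, calls = analyse(omega)
```

For Ω the derived c-calls form the cycle Ω → x@x → y@y → y@y, and no value is derived for Ω itself, which is consistent with its nontermination.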

Following is a central result: that all possible values obtained by the actual evaluation
of Definition 3.3 are accounted for by the approximate evaluation of Definition 3.10.

Lemma 3.11. If $P : [] \to^* e : \rho$ and $e : \rho \Downarrow e' : \rho'$, then $e \Downarrow e'$.

If $P : [] \to^* e : \rho$ and $e : \rho \to e' : \rho'$, then $e \to e'$.

Proof is in the Appendix.



4. A quick review of size-change analysis

Using the framework of [14], the relation between two states $s_1$ and $s_2$ in a call $s_1 \to s_2$
or an evaluation $s_1 \Downarrow s_2$ will be described by means of a size-change graph G.

Example 4.1. Let first-order functions f and g be defined by mutual recursion:

f(x,y) = if x=0 then y else 1: g(x,y,y)

g(u,v,w) = if w=0 then 3: f(u-1,w) else 2: g(u,v-1,w+2)

Label the three function calls 1, 2 and 3. The "control flow graph" in Figure 5 shows
the calling function and called function of each call, e.g., 1 : f → g. Associate with each
call a "size-change graph", e.g., $G_1$ for call 1, that safely describes the data flow from the
calling function's parameters to the called function's parameters. Symbol ↓ indicates a value
decrease.

Termination reasoning: We show that all infinite size-change graph sequences
$\mathcal{M} = g_1 g_2 \ldots \in \{G_1, G_2, G_3\}^\omega$ that follow the program's control flow are impossible (assuming
that the data value set is well-founded):

Case 1: $\mathcal{M} \in \ldots (G_2)^\omega$ ends in infinitely many $G_2$'s: This would imply that variable
v descends infinitely.

Case 2: $\mathcal{M} \in \ldots (G_1 G_2^* G_3)^\omega$. This would imply that variable u descends infinitely.





N. D. JONES AND N. BOHR 



[Figure 5: Call graph and size-change graphs for the example first-order program. Panels: the size-change graph set $\mathcal{G} = \{G_1, G_2, G_3\}$ and the control flow graph.]

Both cases are impossible; therefore a call of any program function with any data will
terminate. End of example.



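Definition 4.2.

(1) A size-change graph $A \xrightarrow{G} B$ consists of a source set A; a target set B; and a set of
labeled² arcs $G \subseteq A \times \{=, \downarrow\} \times B$.

(2) The identity size-change graph for A is $A \xrightarrow{id_A} A$ where $id_A = \{\, x \xrightarrow{=} x \mid x \in A \,\}$.

(3) Size-change graphs $A \xrightarrow{G_1} B$ and $C \xrightarrow{G_2} D$ are composible if B = C. The composition of
$A \xrightarrow{G_1} B$ and $B \xrightarrow{G_2} C$ is $A \xrightarrow{G_1;G_2} C$ where

$$G_1;G_2 = \{\, x \xrightarrow{\downarrow} z \mid\ \downarrow\ \in \{\, r, s \mid x \xrightarrow{r} y \in G_1 \text{ and } y \xrightarrow{s} z \in G_2 \text{ for some } y \in B \,\} \,\}$$
$$\qquad \cup\ \{\, x \xrightarrow{=} z \mid \{=\} = \{\, r, s \mid x \xrightarrow{r} y \in G_1 \text{ and } y \xrightarrow{s} z \in G_2 \text{ for some } y \in B \,\} \,\}$$

Lemma 4.3. Composition is associative, and $A \xrightarrow{G} B$ implies $id_A; G = G; id_B = G$.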
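Composition of size-change graphs is a small relational join. The Python sketch below is illustrative: the representation of a graph as a frozenset of triples (x, label, y), the label strings `DOWN` and `EQ`, and the names `compose` and `ident` are assumptions. The arcs for G1 and G2 are read off the program of Example 4.1 (g is called with u := x, v := y, w := y in call 1, and with u unchanged and v decreased in call 2).

```python
DOWN, EQ = "down", "eq"   # assumed encodings of the two arc labels


def compose(g1, g2):
    """G1;G2: an arc x -> z is 'down' if some connecting pair of arcs contains a 'down',
    and 'eq' if all connecting pairs consist of 'eq' arcs only."""
    labels = {}
    for (x, r, y) in g1:
        for (y2, s, z) in g2:
            if y == y2:
                labels.setdefault((x, z), set()).update((r, s))
    return frozenset((x, DOWN if DOWN in rs else EQ, z)
                     for (x, z), rs in labels.items())


def ident(variables):
    """Identity graph on a set of variables."""
    return frozenset((v, EQ, v) for v in variables)


G1 = frozenset({("x", EQ, "u"), ("y", EQ, "v"), ("y", EQ, "w")})   # call 1: f -> g
G2 = frozenset({("u", EQ, "u"), ("v", DOWN, "v")})                # call 2: g -> g
G3 = frozenset({("u", DOWN, "x"), ("w", EQ, "y")})                # call 3: g -> f
```

Composing G3 with G1 describes a round trip from g back to g and exhibits the decrease of u that Case 2 of the example relies on.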

Definition 4.4. A multipath $\mathcal{M}$ over a set $\mathcal{G}$ of size-change graphs is a finite or infinite
composible sequence of graphs in $\mathcal{G}$. Define

$$\mathcal{G}^\omega = \{\, \mathcal{M} = G_0, G_1, \ldots \mid \text{graphs } G_i, G_{i+1} \text{ are composible for } i = 0, 1, 2, \ldots \,\}$$

Definition 4.5.

(1) A thread in a multipath $\mathcal{M} = G_0, G_1, G_2, \ldots$ is a sequence
$t = a_j \xrightarrow{r_j} a_{j+1} \xrightarrow{r_{j+1}} \ldots$ such that $a_k \xrightarrow{r_k} a_{k+1} \in G_k$ for every $k \geq j$
(and each $r_k$ is = or ↓).

(2) Thread t is of infinite descent if $r_k = \downarrow$ for infinitely many $k \geq j$.

Definition 4.6. The size-change condition.

A set $\mathcal{G}$ of size-change graphs satisfies the size-change condition if every infinite
multipath $\mathcal{M} \in \mathcal{G}^\omega$ contains at least one thread of infinite descent.

Perhaps surprisingly, the size-change condition is decidable. Its worst-case complexity is
shown to be complete for PSPACE in [14] (for first-order programs, in relation to the length
of the program being analysed).
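One standard way to decide the condition, known from the size-change literature rather than spelled out in this section, is to close the graph set under composition and then check that every idempotent self-loop graph contains a strictly decreasing arc from some variable to itself. The Python sketch below follows that idea; the triple representation (source, arcs, target), the label strings and the names `compose` and `satisfies_sct` are assumptions made for illustration.

```python
DOWN, EQ = "down", "eq"   # assumed encodings of the two arc labels


def compose(g1, g2):
    """Composition of arc sets, as in Definition 4.2(3)."""
    labels = {}
    for (x, r, y) in g1:
        for (y2, s, z) in g2:
            if y == y2:
                labels.setdefault((x, z), set()).update((r, s))
    return frozenset((x, DOWN if DOWN in rs else EQ, z)
                     for (x, z), rs in labels.items())


def satisfies_sct(graphs):
    """graphs: iterable of (source, arcs, target), with arcs a frozenset of (x, label, y)."""
    closure = set(graphs)
    todo = list(closure)
    while todo:                                   # close the set under composition
        a, g, b = todo.pop()
        for (c, h, d) in list(closure):
            candidates = []
            if b == c:
                candidates.append((a, compose(g, h), d))
            if d == a:
                candidates.append((c, compose(h, g), b))
            for item in candidates:
                if item not in closure:
                    closure.add(item)
                    todo.append(item)
    for (a, g, b) in closure:                     # inspect idempotent self-loops
        if a == b and compose(g, g) == g:
            if not any(x == z and lab == DOWN for (x, lab, z) in g):
                return False
    return True


# Graphs read off the program of Example 4.1 (an interpretation of the example, not figure data).
G1 = ("f", frozenset({("x", EQ, "u"), ("y", EQ, "v"), ("y", EQ, "w")}), "g")
G2 = ("g", frozenset({("u", EQ, "u"), ("v", DOWN, "v")}), "g")
G3 = ("g", frozenset({("u", DOWN, "x"), ("w", EQ, "y")}), "f")
stuck = ("h", frozenset({("x", EQ, "x")}), "h")   # a loop in which nothing decreases
```

The closure can be exponentially large in the number of variables, in line with the PSPACE-completeness result cited above.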

The example revisited. The program of Figure 5 has three size-change graphs, one for each
of the calls 1 : f → g, 2 : g → g, 3 : g → f, so
$\mathcal{G} = \{ A \xrightarrow{G_1} B,\ B \xrightarrow{G_2} B,\ B \xrightarrow{G_3} A \}$ where
A = {x, y} and B = {u, v, w}. (Note: the vertical layout of size-change graphs in Figure 5
is inessential; one could simply write $G_3 = \{ u \xrightarrow{\downarrow} x,\ w \xrightarrow{=} y \}$.)

$\mathcal{G}$ satisfies the size-change condition, since every infinite multipath has either a thread
that decreases u infinitely, or a thread that decreases v infinitely.

²An arc label signifying ≥ was used in [14] instead of =, but this makes no difference in our context.
5. Tracing data size changes in call-by-value λ-calculus evaluation

The next focus is on size relations between consecutive environments in a call chain.

5.1. Size changes in a computation: a well-founded relation between states.

Definition 5.1.

(1) A name path is a finite string p of variable names, where the empty string is (as usual)
written ε.

(2) The graph basis of a state $s = e : \rho$ is the smallest set gb(s) of name paths satisfying

$$gb(e : \rho) = \{\epsilon\} \cup \{\, xp \mid x \in fv(e) \text{ and } p \in gb(\rho(x)) \,\}$$

By this definition, for the two states in the example of Section 3.2 we have gb(s) = {ε, r, a}
and gb(s') = {ε, r, rr, a}. Further, given a state s and a path p ∈ gb(s), we can find the
substate identified by name path p as follows:

Definition 5.2. The valuation function $\overline{s} : gb(s) \to State$ of a state s is defined by:

$$\overline{s}(\epsilon) = s \quad \text{and} \quad \overline{e : \rho}(xp) = \overline{\rho(x)}(p)$$

We need to develop a size ordering on states. This will be modeled by size-change arcs
$\xrightarrow{=}$ and $\xrightarrow{\downarrow}$. The size relation we use is partly the "subtree" relation on closure values e : ρ,
and partly the "subexpression" relation on λ-expressions.

Definition 5.3.

(1) The state support of a state e : ρ is given by

$$support(e : \rho) = \{e : \rho\} \cup \bigcup_{x \in fv(e)} support(\rho(x))$$

(2) Relations $\succ_1$, $\succ_2$, $\succeq$ and $\succ$ on states are defined by:

• $s_1 \succ_1 s_2$ holds if $support(s_1) \ni s_2$ and $s_1 \neq s_2$;

• $s_1 \succ_2 s_2$ holds if $s_1 = e_1 : \rho_1$ and $s_2 = e_2 : \rho_2$, where $subexp(e_1) \ni e_2$ and $e_1 \neq e_2$
and $\forall x \in fv(e_2).\ \rho_1(x) = \rho_2(x)$. Further,

• Relation $\succeq$ is defined to be the transitive closure of $\succ_1 \cup \succ_2 \cup =$.

• Finally, $s_1 \succ s_2$ if $s_1 \succeq s_2$ and $s_1 \neq s_2$.

Lemma 5.4. The relation $\succ\ \subseteq State \times State$ is well-founded.

We prove that the relation $\succ$ on states is well-founded by proving that

$$e_1 : \rho_1 \succ e_2 : \rho_2 \ \text{ implies that } \ (H(e_1 : \rho_1), L(e_1)) >_{lex} (H(e_2 : \rho_2), L(e_2))$$

in the lexicographic order, where H gives the height of the environment and L gives the
length of the expression. The proof is in the Appendix.

Lemma 5.5. If $p \in gb(s)$ then $s \succeq \overline{s}(p)$. If $p \in gb(s)$ and $p \neq \epsilon$ then $s \succ_1 \overline{s}(p)$.
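The graph basis and the valuation function are both simple recursions over the tree structure of a state. The following Python sketch is illustrative only: states are encoded as pairs (e, rho) with dict environments, name paths as tuples of variable names (the empty tuple playing the role of ε), and the names `gb` and `valuation` are assumptions. The paper omits the details of succ, so a standard Church successor is assumed here; only the fact that it is closed matters.

```python
# Assumed encoding: ("var", x) | ("lam", x, body) | ("app", e1, e2); a state is (e, rho).

def fv(e):
    """Free variables of a lambda-expression."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "lam":
        return fv(e[2]) - {e[1]}
    return fv(e[1]) | fv(e[2])


def gb(state):
    """Graph basis: all name paths leading to a substate of the given state."""
    e, rho = state
    paths = {()}                                   # the empty path
    for x in fv(e):
        paths |= {(x,) + p for p in gb(rho[x])}
    return paths


def valuation(state, path):
    """Substate reached by following the variable names in path, one binding at a time."""
    for x in path:
        state = state[1][x]
    return state


r, a, n, s_, z = (("var", v) for v in ("r", "a", "n", "s", "z"))
two = ("lam", "s", ("lam", "z", ("app", s_, ("app", s_, z))))
# Assumed successor: succ = \n.\s.\z. s@((n@s)@z)
succ = ("lam", "n", ("lam", "s", ("lam", "z",
        ("app", s_, ("app", ("app", n, s_), z)))))
body = ("app", r, ("app", r, a))                   # r@(r@a)

state1 = (body, {"r": (succ, {}), "a": (two, {})})
state2 = (body, {"r": (("lam", "a", body), {"r": (succ, {})}), "a": (two, {})})
```

With this encoding, `gb(state1)` and `gb(state2)` reproduce the sets {ε, r, a} and {ε, r, rr, a} stated for the two example states.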
6. Size-change graphs that safely describe a program

6.1. Safely describing state transitions. We now define the arcs of the size-change
graphs (recalling Definition 4.2):

Definition 6.1. A size-change graph G relating state $s_1$ to state $s_2$ has source $gb(s_1)$ and
target $gb(s_2)$.

Definition 6.2. Let $s_1 = e_1 : \rho_1$ and $s_2 = e_2 : \rho_2$. Size-change graph $s_1 \to s_2, G$ is safe³ for
$(s_1, s_2)$ if

$$p_1 \xrightarrow{=} p_2 \in G \text{ implies } \overline{s_1}(p_1) = \overline{s_2}(p_2) \quad \text{and} \quad p_1 \xrightarrow{\downarrow} p_2 \in G \text{ implies } \overline{s_1}(p_1) \succ \overline{s_2}(p_2)$$

By dom(G) we denote the subset of source(G) from where arcs begin. By codom(G) we
denote the subset of target(G) where arcs end. Notice that if a size-change graph G is safe for
the states $(s_1, s_2)$, then any subset size-change graph $G' \subseteq G$ with source(G') = source(G)
and target(G') = target(G) is safe for $(s_1, s_2)$.

Definition 6.3. A set $\mathcal{G}$ of size-change graphs is safe for program P if $P : [] \to^* s_1 \to s_2$
implies some $G \in \mathcal{G}$ is safe for the pair $(s_1, s_2)$.

Example 6.4. Figure 6 below shows a graph set $\mathcal{G}$ that is safe for the program Ω =
(λx.x@x)@(λy.y@y). For brevity, each subexpression of Ω is referred to by number in the
diagram of $\mathcal{G}$. Subexpression 1 = Ω has no free variables, so arcs from node 1 are labeled
with size-change graphs $G_0 = \emptyset$.

[Figure 6: A set of size-change graphs that safely describe Ω's nonterminating computation. Left: the λ-expression Ω. Right: the set of size-change graphs $\mathcal{G} = \{G_0, G_1, G_2, G_3\}$.]

Theorem 6.5. If $\mathcal{G}$ is safe for program P and satisfies the size-change condition, then
call-by-value evaluation of P terminates.

Proof. Suppose call-by-value evaluation of P does not terminate. Then by Lemma 3.4 there
is an infinite call chain

$$P : [] = e_0 : \rho_0 \to e_1 : \rho_1 \to e_2 : \rho_2 \to \ldots$$

Letting $s_i = e_i : \rho_i$, by safety of $\mathcal{G}$ (Definition 6.3), there is a size-change graph $G_i \in \mathcal{G}$
that safely describes each pair $(s_i, s_{i+1})$. By the size-change condition (Definition 4.6)
the multipath $\mathcal{M} = G_0, G_1, \ldots$ has an infinite thread $t = a_j \xrightarrow{r_j} a_{j+1} \xrightarrow{r_{j+1}} \ldots$ such that
$k \geq j$ implies $a_k \xrightarrow{r_k} a_{k+1} \in G_k$, and each $r_k$ is ↓ or =, and there are infinitely many
$r_k = \downarrow$. Consider the value sequence $\overline{s_j}(a_j), \overline{s_{j+1}}(a_{j+1}), \ldots$. By safety of $G_k$ (Definition
6.2) we have $\overline{s_k}(a_k) \succeq \overline{s_{k+1}}(a_{k+1})$ for every $k \geq j$, and infinitely many proper decreases
$\overline{s_k}(a_k) \succ \overline{s_{k+1}}(a_{k+1})$. However this is impossible since by Lemma 5.4 the relation $\succ$ on
State is well-founded.

Conclusion: call-by-value evaluation of P terminates. □

³The term "safe" comes from abstract interpretation [IS]. An alternative would be "sound."

The goal is partly achieved: We have found a sufficient condition on a set of size-change
graphs to guarantee program termination. What we have not yet done is to find an algorithm
to construct a size-change graph set $\mathcal{G}$ that is safe for P. (The safety condition of Definition 6.3
is in general undecidable, so enumeration of all graphs won't work.) Our graph construction
algorithm is developed in two stages:

• First, the exact evaluation and call relations are "instrumented" so as to produce safe
size-change graphs during evaluation.

• Second, an extension of the abstract interpretation from Section 3.6 yields a computable
over-approximation $\mathcal{G}$ that contains all graphs that can be built during exact evaluation.



6.2. Generating size-change graphs during a computation. We now "instrument"
the exact evaluation and call relations so as to produce safe size-change graphs during
evaluation. In the definition of the size-change graphs x, y, z are variables, and p, q can be
variables or ε, the empty path. Recall the valuation function for a state gives $\overline{s}(\epsilon) = s$, so
in a sense ε is bound to the whole state.

Definition 6.6. (Evaluation and call with graph generation) The extended evaluation and
call judgement forms are $e : \rho \to e' : \rho', G$ and $e : \rho \Downarrow e' : \rho', G$, where source(G) =
fv(e) ∪ {ε} and target(G) = fv(e') ∪ {ε}. The inference rules are:

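$$\frac{}{\lambda x.e : \rho \Downarrow \lambda x.e : \rho,\ id^{=}_{\lambda x.e}}\ \text{(ValueG)}$$

$$\frac{}{e_1@e_2 : \rho \to_r e_1 : \rho,\ id^{\downarrow}_{e_1}}\ \text{(OperatorG)} \qquad\qquad \frac{e_1 : \rho \Downarrow v_1}{e_1@e_2 : \rho \to_d e_2 : \rho,\ id^{\downarrow}_{e_2}}\ \text{(OperandG)}$$

$id^{=}_{e}$ stands for $\{\epsilon \xrightarrow{=} \epsilon\} \cup \{\, y \xrightarrow{=} y \mid y \in fv(e) \,\}$
$id^{\downarrow}_{e}$ stands for $\{\epsilon \xrightarrow{\downarrow} \epsilon\} \cup \{\, y \xrightarrow{=} y \mid y \in fv(e) \,\}$

An arc $y \xrightarrow{=} y$ expresses that the state bound to the variable y is the same on both sides,
before and after the evaluation or call.

The ε "represents" the whole state. In the (ValueG) rule the state $\lambda x.e : \rho$ is the same on
both sides and so there is an arc $\epsilon \xrightarrow{=} \epsilon$. In the (OperatorG) and (OperandG) rules the state
is smaller on the right hand side because we go to a strict subexpression and possibly also
restrict the environment ρ accordingly. So there are $\epsilon \xrightarrow{\downarrow} \epsilon$ arcs.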

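$$\frac{\rho(x) = e' : \rho'}{x : \rho \Downarrow \rho(x),\ \{\, x \xrightarrow{\downarrow} y \mid y \in fv(e') \,\} \cup \{ x \xrightarrow{=} \epsilon \}}\ \text{(VarG)}$$

In the (VarG) rule the state on the right side is ρ(x). This is the state which x is bound
to in the environment on the left hand side, therefore we have an arc $x \xrightarrow{=} \epsilon$. Suppose
$\rho(x) = e' : \rho'$ and $y \in fv(e')$. Then y is bound in ρ' and this binding is then a subtree of
$e' : \rho'$. So we have an arc $x \xrightarrow{\downarrow} y$.

$$\frac{e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0, G_1 \qquad e_2 : \rho \Downarrow v_2, G_2}{e_1@e_2 : \rho \to_c e_0 : \rho_0[x \mapsto v_2],\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}}\ \text{(CallG)}$$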

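In the definition of the size-change graphs used in the (CallG) rule x, y, z are variables,
and p, q can be variables or ε. In $\xrightarrow{r}$ the r can be either ↓ or =. The construction of the
size-change graph associated with the call is explained below.

$G_1^{-\epsilon/\lambda x.e_0}$ stands for cases

$$x \in fv(e_0):\quad \{\, y \xrightarrow{r} z \mid y \xrightarrow{r} z \in G_1 \,\} \cup \{\, \epsilon \xrightarrow{\downarrow} z \mid \epsilon \xrightarrow{r} z \in G_1 \,\}$$

$$x \notin fv(e_0):\quad \{\, y \xrightarrow{r} z \mid y \xrightarrow{r} z \in G_1 \,\} \cup \{\, \epsilon \xrightarrow{\downarrow} z \mid \epsilon \xrightarrow{r} z \in G_1 \,\} \cup \{\, p \xrightarrow{\downarrow} \epsilon \mid p \xrightarrow{r} \epsilon \in G_1 \,\}$$

$G_2^{\epsilon \mapsto x}$ stands for $\{\, y \xrightarrow{r} x \mid y \xrightarrow{r} \epsilon \in G_2 \,\} \cup \{\, \epsilon \xrightarrow{\downarrow} x \mid \epsilon \xrightarrow{r} \epsilon \in G_2 \,\}$

$G \cup_e G'$ stands for the restriction of $G \cup G'$ such that the codomain $\subseteq fv(e) \cup \{\epsilon\}$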

First we consider how much information from $G_1$ we can preserve. The whole
state $e_1@e_2 : \rho$ on the left-hand side of the c-call is strictly larger than $e_1 : \rho$. The variable $x$
is not free in $\lambda x.e_0$ and so does not belong to the target of $G_1$. If a variable $z \in fv(\lambda x.e_0)$
is bound in $\rho_0$ then it is bound to the same state in $\rho_0[x \mapsto v_2]$. Therefore, if there is an
arc $y \xrightarrow{r} z$ in $G_1$, then it also safely describes the c-call and can be preserved. Also, if there
is an arc $\epsilon \xrightarrow{r} z$ in $G_1$, then an arc $\epsilon \xrightarrow{\downarrow} z$ describes the c-call. Further, if $x \notin fv(e_0)$ then
$e_0 : \rho_0[x \mapsto v_2] = e_0 : \rho_0$ and then $\lambda x.e_0 : \rho_0 \succ e_0 : \rho_0[x \mapsto v_2] = e_0 : \rho_0$. In this case, if
there is an arc $p \xrightarrow{r} \epsilon$ going to $\epsilon$ in $G_1$, then the arc $p \xrightarrow{\downarrow} \epsilon$ describes the c-call.

Now consider which information we can gain from $G_2$. The whole state
$e_1@e_2 : \rho$ on the left-hand side of the c-call is strictly larger than $e_2 : \rho$. If $x \in fv(e_0)$ then in
$e_0 : \rho_0[x \mapsto v_2]$ the variable $x$ is bound to the whole state on the right-hand side of the
evaluation of the operand. So in this case, if there is an arc $y \xrightarrow{r} \epsilon$ in $G_2$ then the arc $y \xrightarrow{r} x$
describes the c-call, and if there is an arc $\epsilon \xrightarrow{r} \epsilon$ in $G_2$ then the arc $\epsilon \xrightarrow{\downarrow} x$ describes the
c-call. If $x \notin fv(e_0)$ then we cannot gain any information from $G_2$. The restriction built
into the definition of $\cup_{e_0}$ ensures that this holds.
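The construction of the graph for a c-call can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: arcs are triples `(source, label, target)`, the labels `">"` and `"="` stand for the strict and non-strict arcs, the string `"eps"` stands for the empty path, and the function names `drop_eps`, `bind_eps` and `call_graph` are hypothetical.

```python
EPS = "eps"  # stands for the empty path (the whole state); an assumed encoding


def drop_eps(g1, x, fv_e0):
    """Sketch of G1 with eps removed relative to (lambda x. e0)."""
    x_free = x in fv_e0
    out = set()
    for (p, r, q) in g1:
        if p != EPS and q != EPS:
            out.add((p, r, q))          # variable-to-variable arcs are preserved
        elif p == EPS and (q != EPS or not x_free):
            out.add((EPS, ">", q))      # arcs from eps become strict
        elif q == EPS and not x_free:
            out.add((p, ">", EPS))      # arcs into eps survive only if x is unused
    return out


def bind_eps(g2, x):
    """Sketch of G2 with eps renamed to x: arcs into eps now point at x."""
    return {(p, ">" if p == EPS else r, x) for (p, r, q) in g2 if q == EPS}


def call_graph(g1, g2, x, fv_e0):
    """Union of both parts, restricted to targets in fv(e0) plus eps."""
    allowed = set(fv_e0) | {EPS}
    arcs = drop_eps(g1, x, fv_e0) | bind_eps(g2, x)
    return {a for a in arcs if a[2] in allowed}


g1 = {(EPS, ">", EPS), ("y", "=", "y")}
g2 = {("z", "=", EPS), (EPS, ">", EPS)}
used = call_graph(g1, g2, "x", {"x", "y"})   # x occurs free in e0
unused = call_graph(g1, g2, "x", {"y"})      # x does not occur free in e0
```

When `x` is not free in the body, the final restriction discards every arc contributed by the operand graph, which matches the remark that nothing can be gained from the operand in that case.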

\[ \dfrac{e_1@e_2 : \rho \to_c e' : \rho', G' \qquad e' : \rho' \Downarrow v, G}{e_1@e_2 : \rho \Downarrow v,\ (G';G)}\ (\text{ApplyG}) \]

The size-change graph $(G';G)$ is the composition of the two graphs.
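Composition is not spelled out at this point in the text, so the following sketch assumes the usual definition from the first-order size-change literature: an arc from x to z exists in the composite when some intermediate y links them, and it is strict when either step is strict. The triple encoding and the name `compose` are assumptions made for illustration.

```python
def compose(g1, g2):
    """Compose two size-change graphs given as sets of (source, label, target)."""
    arcs = set()
    for (x, r1, y) in g1:
        for (y2, r2, z) in g2:
            if y == y2:
                # a path is strict as soon as one of its two steps is strict
                arcs.add((x, ">" if ">" in (r1, r2) else "=", z))
    # keep at most one arc per pair, preferring the strict one
    return {(x, r, z) for (x, r, z) in arcs if r == ">" or (x, ">", z) not in arcs}


single = compose({("x", ">", "y")}, {("y", "=", "z")})
merged = compose({("x", "=", "y"), ("x", ">", "w")},
                 {("y", "=", "z"), ("w", "=", "z")})
```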

In the size-change graphs generated by the rules above, the less-than relations ($x \xrightarrow{\downarrow} y$) in
the (VarG) rule arise from the sub-environment property of $\succ_1$ from Lemma 5.5. The remaining
relations $\xrightarrow{\downarrow}$ arise from the subexpression property of $\succ_2$. The relations based on the
sub-environment property capture the case that the state on the right-hand side is fetched from
the environment on the left-hand side. The equality relations $\xrightarrow{=}$ describe how values are
preserved under calls and evaluations.

Lemma 6.7. $s \to s'$ (by Definition 3.5) iff $s \to s', G$ (by Definition 6.6) for some $G$.
Further, $s \Downarrow s'$ iff $s \Downarrow s', G$ for some $G$.

Theorem 6.8. (The extracted graphs are safe)

$s \to s', G$ or $s \Downarrow s', G$ (by Definition 6.6) implies $G$ is safe for $(s, s')$ (with source and target
sets extended as necessary).

Lemma 6.7 is immediate since the new rules extend the old, without any restriction on
their applicability. The proof of the "safety" Theorem 6.8 is in the Appendix.









Figure 7: Data-flow in a variable evaluation

Figure 8: Data-flow in an application



The diagram of Figure 7 illustrates the data-flow in a variable evaluation. The diagram of
Figure 8 may be of some use in visualising data-flow during evaluation of $e_1@e_2$. States
are in ovals and triangles represent environments. In the application $e_1@e_2 : \rho$ on the left,
operator $e_1 : \rho$ evaluates to $\lambda x.e_0 : \rho_0, G_1$ and operand $e_2 : \rho$ evaluates to $e' : \rho', G_2$. The
size-change graphs $G_1$ and $G_2$ show relations between variables bound in their environments.
There is a call from the application $e_1@e_2 : \rho$ to $e_0 : \rho_0[x \mapsto e' : \rho']$, the body of the
operator-value with the environment extended with a binding of $x$ to the operand-value $e' : \rho'$.

It is possible to approximate the calls and evaluates-to relations with different degrees of
precision depending on how much information is kept about the bindings in the environment.
Here we aim at a coarse approximation, where we remove all environment components.¹

6.3. Construction of size-change graphs by abstract interpretation. We now extend
the coarse approximation to construct size-change graphs.

Definition 6.9. (Approximate evaluation and call with graph generation)

The judgement forms are now $e \to e', G$ and $e \Downarrow e', G$, where $source(G) = fv(e) \cup \{\epsilon\}$ and
$target(G) = fv(e') \cup \{\epsilon\}$. The inference rules are:

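\[ \dfrac{}{\lambda x.e \Downarrow \lambda x.e,\ id^{=}_{\lambda x.e}}\ (\text{ValueAG}) \qquad \dfrac{e_1@e_2 \in subexp(P) \quad e_1 \Downarrow \lambda x.e_0, G_1 \quad e_2 \Downarrow v_2, G_2}{x \Downarrow v_2,\ \{x \xrightarrow{\downarrow} y \mid y \in fv(v_2)\} \cup \{x \xrightarrow{=} \epsilon\}}\ (\text{VarAG}) \]

\[ \dfrac{}{e_1@e_2 \to_r e_1,\ id^{\downarrow}_{e_1}}\ (\text{OperatorAG}) \qquad \dfrac{}{e_1@e_2 \to_d e_2,\ id^{\downarrow}_{e_2}}\ (\text{OperandAG}) \]

\[ \dfrac{e_1 \Downarrow \lambda x.e_0, G_1 \quad e_2 \Downarrow v_2, G_2}{e_1@e_2 \to_c e_0,\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}}\ (\text{CallAG}) \qquad \dfrac{e_1@e_2 \to_c e', G' \quad e' \Downarrow v, G}{e_1@e_2 \Downarrow v,\ G';G}\ (\text{ApplyAG}) \]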

Lemma 6.10. Suppose $P : [] \to^* e : \rho$. If $e : \rho \to e' : \rho', G$ by Definition 6.6 then $e \to e', G$.
Further, if $e : \rho \Downarrow e' : \rho', G$ then $e \Downarrow e', G$.

Proof. Follows from Lemma 3.11; see the Appendix. □

Definition 6.11.

\[ absint(P) = \{\, G_j \mid j > 0 \wedge \exists e_i, G_i\ (0 < i \le j) : P = e_0 \wedge (e_0 \to e_1, G_1) \wedge \ldots \wedge (e_{j-1} \to e_j, G_j) \,\} \]

Theorem 6.12.

(1) The set $absint(P)$ is safe for $P$.

(2) The set $absint(P)$ can be effectively computed from $P$.

Proof. Part 1: Suppose $P : [] = s_0 \to s_1 \to \ldots \to s_j$. Theorem 6.8 implies $s_i \to s_{i+1}, G_{i+1}$
where each $G_{i+1}$ is safe for the pair $(s_i, s_{i+1})$. Let $s_i = e_i : \rho_i$. By Lemma 6.10, $e_i \to e_{i+1}, G_{i+1}$.
By the definition of $absint(P)$, $G_j \in absint(P)$.

Part 2: There is only a fixed number of subexpressions of $P$, or of possible size-change
graphs with source and target $\subseteq \{\epsilon\} \cup \{x \mid x \text{ is a variable in } P\}$. Thus $absint(P)$ can be
computed by applying Definition 6.9 exhaustively, starting with $P$, until no new graphs or
subexpressions are obtained. □
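The "apply exhaustively until nothing new is obtained" argument of Part 2 is an instance of worklist saturation over a finite universe. The sketch below shows only that generic loop; the name `saturate` and the shape of the `step` callback are assumptions, and the toy reachability example stands in for the actual inference rules of Definition 6.9.

```python
def saturate(seed, step):
    """Apply `step` exhaustively from `seed` until no new facts are obtained.

    `step(fact, known)` returns the facts derivable in one rule application
    from `fact` together with the already known facts.
    """
    known = set(seed)
    work = list(known)
    while work:
        fact = work.pop()
        for new in list(step(fact, frozenset(known))):
            if new not in known:      # only genuinely new facts are revisited
                known.add(new)
                work.append(new)
    return known


# Toy use: facts are nodes, one "rule" follows an edge of a small graph.
succs = {1: [2], 2: [3], 3: [1], 4: [5]}
reach = saturate({1}, lambda n, known: succs.get(n, []))
```

Termination of the loop relies on the same observation as the proof: each fact enters the worklist at most once, and the set of possible facts is finite.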



¹ It is possible to keep a little more information in the graphs than we do here even with no knowledge
about value-bindings in the environment. We have chosen the given presentation for simplicity.
7. Some examples

7.1. A simple example. Using Church numerals ($n = \lambda s.\lambda z.s^n(z)$), we expect 2 succ
to reduce to succ (succ 0). However, this contains unreduced redexes because call-by-value
does not reduce under a λ, so we force the computation to carry on through by applying 2
succ to the identity (twice). This gives:

    2 succ id1 id2 where

    succ = λm.λs.λz. m s (s z)
    id1 = λx.x
    id2 = λy.y

After writing this out in full as a λ-expression, our analyser yields (syntactically sugared):

    [λs2.λz2.(s2 @ (s2 @ z2))]            -- two --
    @ [λm.λs.λz. 15: ((m@s)@(s@z))]       -- succ --
    [λs1.λz1.z1]                          -- zero --
    [λx.x]                                -- id1 --
    [λy.y]                                -- id2 --

Output of loops from an analysis of this program:

    15 →* 15: [(m,>,m),(s,=,s),(z,=,z)], []

    Size-Change Termination: Yes

The first number refers to the program point, then comes a list of edges. The loop occurs
because application of 2 forces the code for the successor function to be executed twice, with
decreasing argument values m. The notation for edges is a little different from previously;
here (m,>,m) stands for $m \xrightarrow{\downarrow} m$.
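How a "Yes" verdict can follow from a list of loop graphs may be sketched as follows. The check used here is the standard criterion from the first-order size-change literature (close the graphs under composition and require a strict self-arc in every idempotent graph); it is an assumption that the analyser works this way, and the sketch further assumes that all graphs passed in are loops at one and the same program point.

```python
def compose(g1, g2):
    """Standard composition of size-change graphs (sets of arc triples)."""
    arcs = set()
    for (x, r1, y) in g1:
        for (y2, r2, z) in g2:
            if y == y2:
                arcs.add((x, ">" if ">" in (r1, r2) else "=", z))
    return frozenset(a for a in arcs if a[1] == ">" or (a[0], ">", a[2]) not in arcs)


def closure(graphs):
    """Close a set of loop graphs under composition."""
    known = {frozenset(g) for g in graphs}
    changed = True
    while changed:
        changed = False
        for g in list(known):
            for h in list(known):
                gh = compose(g, h)
                if gh not in known:
                    known.add(gh)
                    changed = True
    return known


def size_change_terminating(graphs):
    """Every idempotent graph in the closure needs some strict arc x > x."""
    for g in closure(graphs):
        if compose(g, g) == g and not any(r == ">" and x == y for (x, r, y) in g):
            return False
    return True


loop_15 = {("m", ">", "m"), ("s", "=", "s"), ("z", "=", "z")}
verdict = size_change_terminating([loop_15])
```

For the single loop at program point 15 the graph is idempotent and contains the strict arc on m, so the check succeeds.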



7.2. f n x = x + 2^n by Church numerals. This more interesting program computes f n x =
x + 2^n by higher-order primitive recursion. If n is a Church numeral then expression n g
x reduces to g^n(x). Let x be the successor function, and g be a "double application"
functional. Expressed in a readable named combinator form, we get:

    f n x where

    f n = if n=0 then succ else g(f(n-1))
    g r a = r(r a)

As a lambda-expression (applied to values n = 3, x = 4) this can be written:

    [λn.λx. n                                 -- n --
       @ [λr.λa. 11: (r@ 13: (r@a))]          -- g --
       @ [λk.λs.λz. (s@((k@s)@z))]            -- succ --
       @ x ]                                  -- x --
    @ [λs2.λz2. (s2@(s2@(s2@z2)))]            -- 3 --
    [λs1.λz1. (s1@(s1@(s1@(s1@z1))))]         -- 4 --

Following is the output from program analysis. The analysis found the following loops from
a program point to itself with the associated size-change graph and path. The first number
refers to the program point, then comes a list of edges and last a list of numbers, the other
program points that the loop passes through.

    SELF Size-Change Graphs, no repetition of graphs:

    11 →* 11: [(r,>,r)]           []
    11 →* 11: [(a,=,a),(r,>,r)]   [13]
    13 →* 13: [(a,=,a),(r,>,r)]   [11]
    13 →* 13: [(r,>,r)]           [11,11]

    Size-Change Termination: Yes
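The arithmetic claim behind this example, f n x = x + 2^n, can be checked by running the combinators directly, since Python also evaluates call-by-value. This is only an executable sanity check of the combinator definitions, not part of the termination analysis; the helpers `church` and `to_int` are introduced here for illustration.

```python
def church(n):
    """Church numeral n: maps s to the n-fold iteration of s."""
    def numeral(s):
        def iterate(z):
            for _ in range(n):
                z = s(z)
            return z
        return iterate
    return numeral


def to_int(c):
    """Read a Church numeral back as a Python int."""
    return c(lambda k: k + 1)(0)


succ = lambda k: lambda s: lambda z: s(k(s)(z))   # successor on Church numerals
g = lambda r: lambda a: r(r(a))                    # the "double application" functional
f = lambda n: lambda x: n(g)(succ)(x)              # f n x = g^n(succ)(x)

result = to_int(f(church(3))(church(4)))           # 4 + 2**3
```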



7.3. Ackermann's function, second-order. This can be written without recursion using
Church numerals as: a m n where a = λm. m b succ and b = λg.λn. n g (g 1).
Consequently a m = b^m(succ) and b g n = g^(n+1)(1), which can be seen to agree with the
usual first-order definition of Ackermann's function. Following is the same as a
lambda-expression applied to argument values m=2, n=3, with numeric labels on some
subexpressions.

    (λm.m b succ) 2 3 = (λm.m@b@succ)@2@3
                      = (λm.m@(λg.λn.n@g@(g@1))@succ)@2@3
                      = (λm.m@(λg.λn. 9: (n@g@ 13: (g@1)))@succ)@2@3
    where
      1    = λs1.λz1. 17: (s1@z1)
      succ = λk.λs.λz. 23: (s@ 25: (k@s@z))
      2    = λs2.λz2. s2@(s2@z2)
      3    = λs3.λz3. 39: (s3@ 41: (s3@ 43: (s3@z3)))

Output from an analysis of this program is shown here.
(It is not always the case that the same loop is shown for all program points in its path.)

    SELF Size-Change Graphs, no repetition of graphs:

     9 →*  9: [(ε,>,n),(g,>,g)]            [13]
     9 →*  9: [(g,>,g)]                    [17]
    13 →* 13: [(g,>,g)]                    [9]
    17 →* 17: [(s1,>,s1)]                  [9]
    23 →* 23: [(k,>,k),(s,=,s),(z,=,z)]    [25]
    23 →* 23: [(s,>,s)]                    [9]
    23 →* 23: [(s,>,s),(z,>,k)]            [25,17,9]
    25 →* 25: [(k,>,k),(s,=,s),(z,=,z)]    [23]
    25 →* 25: [(s,>,s),(z,>,k)]            [17,9,23]
    25 →* 25: [(s,>,s)]                    [23,9,23]
    39 →* 39: [(s3,>,s3)]                  [9]
    41 →* 41: [(s3,>,s3)]                  [9,39]
    43 →* 43: [(s3,>,s3)]                  [9,39,41]

    Size-Change Termination: Yes
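That the recursion-free definition a = λm. m b succ, b = λg.λn. n g (g 1) agrees with the usual first-order Ackermann function can be confirmed on small arguments. The sketch below is an illustrative check only; `church`, `to_int` and the reference function `ack` are helper names introduced here.

```python
def church(n):
    """Church numeral n: maps s to the n-fold iteration of s."""
    def numeral(s):
        def iterate(z):
            for _ in range(n):
                z = s(z)
            return z
        return iterate
    return numeral


def to_int(c):
    return c(lambda k: k + 1)(0)


succ = lambda k: lambda s: lambda z: s(k(s)(z))
one = church(1)
b = lambda g: lambda n: n(g)(g(one))   # b g n = g^(n+1)(1)
a = lambda m: m(b)(succ)               # a m = b^m(succ)


def ack(m, n):
    """Usual first-order definition, used only as a reference."""
    if m == 0:
        return n + 1
    if n == 0:
        return ack(m - 1, 1)
    return ack(m - 1, ack(m, n - 1))


value = to_int(a(church(2))(church(3)))
```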
7.4. Arbitrary natural numbers as inputs. The astute reader may have noticed a
limitation in the above examples: each only concerns a single λ-expression, e.g., Ackermann's
function applied to argument values m=2, n=3.

In an implemented version of the λ-termination analysis a program may have an
arbitrary natural number as input; this is represented by •. Further, programs can have as
constants the predecessor, successor and zero-test functions, and if-then-else expressions.
We show, by some examples using •, that the size-change termination approach can handle
the Y-combinator.

In Section 8 we show how to do size-change analysis of λ-expressions applied to sets of
argument values in a more classic context, using Church or other numeral notations instead
of •.



7.5. A minimum function, with general recursion and Y-combinator. This program
computes the minimum of its two inputs using the call-by-value combinator
Y = λp. [λq.p@(λs.q@q@s)] @ [λt.p@(λu.t@t@u)]. The program, first as a first-order recursive
definition:

    m x y = if x=0 then 0 else if y=0 then 0 else succ (m (pred x) (pred y))

Now, in λ-expression form for analysis:

    {λp. [λq.p@(λs.q@q@s)] @ [λt.p@(λu.t@t@u)]}      -- the Y combinator --
    [λm.λx.λy.
       27: if((ztst @ x),
              0,
              32: if((ztst @ y),
                     0,
                     succ @ 37: 39: m @ (pred@x) @ (pred@y)]

Output of loops from an analysis of this program:

    27 →* 27: [(x,>,x),(y,>,y)]   [32,37,39]
    32 →* 32: [(x,>,x),(y,>,y)]   [37,39,27]
    37 →* 37: [(x,>,x),(y,>,y)]   [39,27,32]
    39 →* 39: [(x,>,x),(y,>,y)]   [27,32,37]

    Size-Change Termination: Yes
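The call-by-value Y combinator given above can be transcribed directly into Python, which is also call-by-value, to see the recursion it sets up. In this sketch Python integers and the conditional expression stand in for the constants ztst, pred, succ and if-then-else; that substitution is an assumption made for illustration, and `minimum` is a name chosen here.

```python
# Y = λp. [λq. p@(λs. q@q@s)] @ [λt. p@(λu. t@t@u)]
Y = lambda p: (lambda q: p(lambda s: q(q)(s)))(lambda t: p(lambda u: t(t)(u)))

# m x y = if x=0 then 0 else if y=0 then 0 else succ (m (pred x) (pred y))
minimum = Y(lambda m: lambda x: lambda y:
            0 if x == 0 else (0 if y == 0 else 1 + m(x - 1)(y - 1)))

smallest = minimum(3)(5)
```

The eta-expanded inner abstractions (λs. q@q@s and λu. t@t@u) are what keep the self-application from unfolding forever under call-by-value; both x and y decrease on every recursive call, in line with the reported edges (x,>,x) and (y,>,y).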



7.6. Ackermann's function, second-order with constants and Y-combinator.
Ackermann's function can be written as: a m n where a m = b^m(suc) and b g n = g^(n+1)(1).
The following program expresses the computations of both a and b by loops, using the Y
combinator (twice).

    [λy.λy1.
      (y1
        λa.λm.
          11: if((ztst@m),
                 λv.(suc@v),
                 ((y @
                   19: λb.λf.λn.
                         if((ztst@n),
                            (f@1),
                            25: 29: 32: 34: b @ f @ (pred@n))
                   41: a (pred@m)
    ]
    {λp. [λq.p@(λs.q@q@s)] @ [λt.p@(λu.t@t@u)]}
    @ {λp1. [λq1.p1@(λs1. 72: q1@q1@s1)] @ [λt1.p1@(λu1. 81: t1@t1@u1)]}

Output of loops from an analysis of this program:

    SELF Size-Change Graphs, no repetition of graphs:

    11 →* 11: [(a,>,y),(m,>,m)]   [19,41,72]
    11 →* 11: [(m,>,m)]           [19,41,72,11,19,41,72]
    19 →* 19: [(a,>,y),(m,>,m)]   [41,72,11]
    19 →* 19: [(m,>,m)]           [41,72,11,19,41,72,11]
    25 →* 25: [(f,>,b),(f,>,f)]   [29]
    25 →* 25: [(f,=,f),(n,>,n)]   [32,34]
    25 →* 25: [(f,>,f)]           [29,25,32,34]
    29 →* 29: [(f,>,f)]           [25]
    32 →* 32: [(f,>,b),(f,>,f)]   [25]
    32 →* 32: [(f,=,f),(n,>,n)]   [34,25]
    32 →* 32: [(f,>,f)]           [25,32,34,25]
    34 →* 34: [(f,=,f),(n,>,n)]   [25,32]
    34 →* 34: [(f,>,b),(f,>,f)]   [25,29,25,32]
    34 →* 34: [(f,>,f)]           [25,29,25,32,34,25,32]
    41 →* 41: [(m,>,m)]           [72,11,19]
    72 →* 72: [(s1,>,s1)]         [11,19,41]
    81 →* 81: [(u1,>,u1)]         [11,19,41]

    Size-Change Termination: Yes

7.7. Imprecision of abstract interpretation. It is natural to wonder whether the gross
approximation of Definition 3.10 comes at a cost. The (VarA) rule can in effect "mix up"
different function applications, losing the coordination between operator and operand that
is present in the exact semantics.

We have observed this in practice: the first time we programmed Ackermann's
function using explicit recursion, we used the same instance of the Y-combinator for both
loops, so the single Y-combinator expression was "shared". The analysis did not discover
that the program terminated.

However, when this was replaced by the "unshared" version above, with two instances
of the Y-combinator (y and y1), one for each application, the problem disappeared and
termination was correctly recognised.



7.8. A counterexample to a conjecture. Sereni disproved in [20, 21] our conjecture
that the size-change method would recognise as terminating any simply typed λ-expression.
The root of the problem is the imprecision of abstract interpretation just noted. A
counterexample: the λ-expression

E = (λa.a(λb.a(λcd.d)))(λe.e(λf.f))

is simply-typable but not size-change terminating. Its types are any instantiation of

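[Type table for the variables a; b, c; d; e; f, expressed in terms of a type τ. The individual type expressions are not recoverable.]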
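A typing for E can be worked out by hand. The assignment below is derived here from the term itself and is not copied from the source; the type variables τ and δ are our own choice of names and may be instantiated freely. The argument λe.e(λf.f) forces the type of a, and the two uses of a in the operator body then fix the types of b, c and d.

```latex
\begin{align*}
f &: \tau \\
b,\ c &: \tau \to \tau \\
d &: \delta \\
e &: (\tau \to \tau) \to (\delta \to \delta) \\
a &: ((\tau \to \tau) \to (\delta \to \delta)) \to (\delta \to \delta) \\
E &: \delta \to \delta
\end{align*}
```

Checking the inner application: λcd.d has type (τ → τ) → (δ → δ), which is the argument type of a, so a(λcd.d) has type δ → δ, and therefore λb.a(λcd.d) again has the argument type of a.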



8. Arbitrary λ-regular program inputs (Extended λ-calculus)

Above we have analysed the termination behaviour of a single closed λ-expression. We
now analyse the termination behaviour for a program in the λ-calculus for all possible inputs
from a given input-set of λ-expressions (e.g., Church numerals). The first step is to define
which sets of λ-expressions we consider. A well-defined input set will be the set of closed
expressions in the "language" generated by a λ-regular grammar.

We extend the syntax and semantics of the λ-calculus to handle expressions containing
nonterminals. An extended lambda term represents all instances of a program with input
taken from the input set. If our analysis certifies that the extended term terminates, then
this implies that the program will terminate for all possible inputs.

8.1. λ-regular grammars. We are interested in a λ-regular grammar for the sake of the
language that it generates: a set of pure λ-expressions (without nonterminals). This is done
using the derivation relation $\Rightarrow^*_\Gamma$, soon to be defined.

Definition 8.1.

(1) A λ-regular grammar has form $\Gamma = (N, \Pi)$ where $N$ is a finite set of nonterminal symbols
and $\Pi$ is a finite set of productions.

(2) A Γ-extended λ-expression has the following syntax:

    e, P ::= x | A | e @ e | λx.e
    A ::= Non-terminal name, A ∈ N
    x ::= Variable name

$Exp_\Gamma$ denotes the set of Γ-extended λ-expressions. $Exp$ denotes the set of pure
λ-expressions (without nonterminals). Clearly $Exp_\Gamma \supseteq Exp$.

(3) A production has form A ::= e where e is a Γ-extended λ-expression.
Definition 8.2. Let $nt(e) = \{X_1, \ldots, X_k\}$ denote the multi-set of nonterminal occurrences
in $e \in Exp_\Gamma$. The derivation relation $\Rightarrow^*_\Gamma\ \subseteq Exp_\Gamma \times Exp$ is the smallest relation such that

(1) If $nt(e) = \{X_1, \ldots, X_k\}$ and $X_i \Rightarrow^*_\Gamma t_i \in Exp$ for $i = 1, \ldots, k$,
then $e \Rightarrow^*_\Gamma e[t_1/X_1, \ldots, t_k/X_k]$.

(2) If $A ::= e \in \Gamma$ and $e \Rightarrow^*_\Gamma e'$ then $A \Rightarrow^*_\Gamma e'$.

Notice that $\Rightarrow^*_\Gamma$ relates extended λ-terms to pure λ-terms.

In Definition 8.2, $nt(e) = \{X_1, \ldots, X_k\}$ denotes the multi-set of nonterminal occurrences
in $e$, so two different $X_i, X_j$ may be instances of the same nonterminal A. In the
substitution $e[t_1/X_1, \ldots, t_k/X_k]$ two such different instances of a nonterminal may be replaced by
different pure λ-terms.

Example 8.3. A grammar for Church numerals: Consider

$\Gamma = (\{C, A\}, \{C ::= \lambda s.\lambda z.A,\ A ::= z,\ A ::= s@A\})$

Here $A \Rightarrow^*_\Gamma v$ iff $v$ has form $s^n(z)$ for some $n \ge 0$. Clearly $C \Rightarrow^*_\Gamma v$ iff $v$ has form $\lambda s.\lambda z.s^n(z)$
for some $n \ge 0$.
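The language generated by this grammar can be enumerated mechanically. The sketch below uses an assumed encoding of terms as nested tuples and a hypothetical function `derive` that bounds the nesting depth of production steps; each nonterminal occurrence is expanded independently, as the definition of the derivation relation allows.

```python
GRAMMAR = {
    "C": [("lam", "s", ("lam", "z", ("nt", "A")))],
    "A": [("var", "z"), ("app", ("var", "s"), ("nt", "A"))],
}


def derive(term, depth):
    """Pure terms derivable from `term` with at most `depth` nested productions."""
    kind = term[0]
    if kind == "var":
        return {term}
    if kind == "lam":
        return {("lam", term[1], body) for body in derive(term[2], depth)}
    if kind == "app":
        return {("app", left, right)
                for left in derive(term[1], depth)
                for right in derive(term[2], depth)}
    if depth == 0:                       # nonterminal, but no steps left
        return set()
    out = set()
    for rhs in GRAMMAR[term[1]]:
        out |= derive(rhs, depth - 1)
    return out


def show(term):
    kind = term[0]
    if kind in ("var", "nt"):
        return term[1]
    if kind == "lam":
        return "λ%s.%s" % (term[1], show(term[2]))
    return "(%s@%s)" % (show(term[1]), show(term[2]))


bodies = sorted((show(t) for t in derive(("nt", "A"), 3)), key=len)
numerals = sorted((show(t) for t in derive(("nt", "C"), 3)), key=len)
```

With depth 3 the nonterminal A yields the bodies of the numerals 0, 1 and 2; C needs one production step for itself and therefore yields the numerals 0 and 1 at the same bound.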

The following assumption makes proofs more convenient; proof is standard and so omitted.

Lemma 8.4. For any λ-regular grammar $\Gamma_0$ there exists an equivalent λ-regular grammar
$\Gamma_1$ such that no production in $\Gamma_1$ has form $A ::= A'$ where $A' \in N$. We henceforth assume
that all productions in a λ-regular grammar have form $A ::= e$ where $e \notin N$. □

Definition 8.5. In the following $e$ is a Γ-extended λ-expression:

(1) Define the free variables of $e$ by $fv(e) = \{x \mid \exists t.\ e \Rightarrow^*_\Gamma t \text{ and } x \in fv(t)\}$.

(2) Define that $e$ is closed iff $t$ is closed for all $t$ such that $e \Rightarrow^*_\Gamma t$. It follows that $e$ is closed
iff $fv(e) = \{\}$.

(3) Define $subterms(e)$ inductively by:
For a variable x: $subterms(x) = \{x\}$.
For an abstraction λx.e: $subterms(\lambda x.e) = \{\lambda x.e\} \cup subterms(e)$.
For an application $e_1@e_2$: $subterms(e_1@e_2) = \{e_1@e_2\} \cup subterms(e_1) \cup subterms(e_2)$.
For a nonterminal A: $subterms(A) = \{A\}$.

(4) Define $subexps(e)$ as the smallest set satisfying:
For a variable x: $subexps(x) = \{x\}$.
For an abstraction λx.e: $subexps(\lambda x.e) = \{\lambda x.e\} \cup subexps(e)$.
For an application $e_1@e_2$: $subexps(e_1@e_2) = \{e_1@e_2\} \cup subexps(e_1) \cup subexps(e_2)$.
For a nonterminal A: $subexps(A) = \{A\} \cup \{t \mid \exists e.\ A ::= e \in \Gamma \text{ and } t \in subexps(e)\}$.

If $e' \in subterms(e)$ then $e'$ is syntactically present as part of $e$.

If $e' \in subexps(e)$ then $e'$ is either a subterm of $e$ or a subexpression of a nonterminal
$A \in subterms(e)$.

Sets $subterms(e)$, $subexps(e)$ are both finite, and $subterms(e) = subexps(e)$ for
expressions $e$ in the pure λ-calculus.

Example 8.6. In the grammar for Church numerals, C is a closed Γ-extended expression,
but A is not a closed Γ-extended expression. Further, $subexps(A) = \{A, z, s@A, s\}$,
$subexps(C) = \{C, \lambda s.\lambda z.A, \lambda z.A, A, z, s@A, s\}$, $fv(C) = \{\}$, $fv(A) = \{s, z\}$.

Lemma 8.7. Let $x$ be a variable. If $A \Rightarrow^*_\Gamma x$ then $A ::= x \in \Gamma$.

If $A \Rightarrow^*_\Gamma \lambda x.e$ then there exists $e' \in Exp_\Gamma$ such that $A ::= \lambda x.e' \in \Gamma$.

If $A \Rightarrow^*_\Gamma e_1@e_2$ then there exist $e_1', e_2' \in Exp_\Gamma$ such that $A ::= e_1'@e_2' \in \Gamma$.

Any production has one of the forms $A ::= x$, $A ::= \lambda x.e$, $A ::= e_1@e_2$. No production
performed on a subterm (which must be a nonterminal) can give a new outermost syntactic
term-constructor.

The following Lemma follows from the definition of free variables of an extended
expression.

Lemma 8.8. For a variable x: $fv(x) = \{x\}$.

For an abstraction λx.e: $fv(\lambda x.e) = fv(e) \setminus \{x\}$.

For an application $e_1@e_2$: $fv(e_1@e_2) = fv(e_1) \cup fv(e_2)$.

For a nonterminal $A \in N$: $fv(A) = \{x \mid \exists t.\ A \Rightarrow^*_\Gamma t \text{ and } x \in fv(t)\}$.

Lemma 8.9. For $A \in N$ the sets $subexps(A)$ and $fv(A)$ are finite and computable.

Proof is straightforward.
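One way to see the computability claim is the following sketch, which computes both sets for the Church-numeral grammar. The tuple encoding and function names are assumptions. The free-variable computation iterates to a least fixed point and agrees with the definition only under the extra assumption that every nonterminal derives at least one pure term, which holds for this grammar.

```python
GRAMMAR = {
    "C": [("lam", "s", ("lam", "z", ("nt", "A")))],
    "A": [("var", "z"), ("app", ("var", "s"), ("nt", "A"))],
}


def subexps(term, seen=None):
    """Subexpressions of `term`, following productions of nonterminals."""
    seen = set() if seen is None else seen
    if term in seen:                  # already visited; cuts recursive productions
        return seen
    seen.add(term)
    kind = term[0]
    if kind == "lam":
        subexps(term[2], seen)
    elif kind == "app":
        subexps(term[1], seen)
        subexps(term[2], seen)
    elif kind == "nt":
        for rhs in GRAMMAR[term[1]]:
            subexps(rhs, seen)
    return seen


def free_vars(term, fv_nt):
    """Free variables of `term`, given the current estimate for nonterminals."""
    kind = term[0]
    if kind == "var":
        return {term[1]}
    if kind == "lam":
        return free_vars(term[2], fv_nt) - {term[1]}
    if kind == "app":
        return free_vars(term[1], fv_nt) | free_vars(term[2], fv_nt)
    return set(fv_nt[term[1]])


def nonterminal_fv():
    """Least fixed point of the free-variable equations over all productions."""
    fv_nt = {name: set() for name in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for name, rhss in GRAMMAR.items():
            new = set().union(*(free_vars(rhs, fv_nt) for rhs in rhss))
            if new != fv_nt[name]:
                fv_nt[name] = new
                changed = True
    return fv_nt


fv = nonterminal_fv()
sub_a = subexps(("nt", "A"))
sub_c = subexps(("nt", "C"))
```

The results can be compared with Example 8.6: four subexpressions for A, seven for C, no free variables for C, and s and z free in A.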



8.2. Extended environment-based semantics. A semantics extending the environment-based
semantics of the pure λ-calculus addresses the problem of substitution in expressions with
non-terminals. Environments bind λ-variables (and not non-terminals) to values.

Definition 8.10. (Extended states, values and environments) State, Value, Env are the
smallest sets such that

    State = { e : ρ | e ∈ Exp_Γ, ρ ∈ Env and fv(e) ⊆ dom(ρ) }
    Value = { λx.e : ρ | λx.e : ρ ∈ State }
    Env = { ρ : X → Value | X is a finite set of variables }

The empty environment with domain X = ∅ is written []. The evaluation judgement form
is $s \Downarrow v$ where $s \in State$, $v \in Value$.

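The following rules for calls and evaluations in the extended language are simple
extensions of the rules for pure λ-calculus to also handle nonterminals.

Definition 8.11. (Extended environment-based evaluation) The judgement forms are
$e : \rho \to e' : \rho'$ and $e : \rho \Downarrow e' : \rho'$, where $e, e' \in Exp_\Gamma$, and $e : \rho$ and $e' : \rho'$ are states. The
evaluation and call relations $\Downarrow, \to$ are defined by the following inference rules, where
$\to\ =\ \to_r \cup \to_d \cup \to_c \cup \to_n$.

New rule:

\[ \dfrac{A ::= e \in \Gamma}{A : \rho \to_n e : \rho}\ (\text{GramX}) \]

Extended Def. 3.5 (Apply):

\[ \dfrac{e : \rho \to_x e' : \rho' \qquad e' : \rho' \Downarrow v \qquad x \in \{c, n\}}{e : \rho \Downarrow v}\ (\text{ResultX}) \]

The following rules have not been changed (but now expressions belong to $Exp_\Gamma$).

\[ \dfrac{}{\lambda x.e : \rho \Downarrow \lambda x.e : \rho}\ (\text{ValueX}) \qquad \dfrac{\rho(x) = e' : \rho'}{x : \rho \Downarrow e' : \rho'}\ (\text{VarX}) \]

\[ \dfrac{}{e_1@e_2 : \rho \to_r e_1 : \rho}\ (\text{OperatorX}) \qquad \dfrac{e_1 : \rho \Downarrow v_1}{e_1@e_2 : \rho \to_d e_2 : \rho}\ (\text{OperandX}) \]

\[ \dfrac{e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0 \qquad e_2 : \rho \Downarrow v_2}{e_1@e_2 : \rho \to_c e_0 : \rho_0[x \mapsto v_2]}\ (\text{CallX}) \]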
A Γ-extended program is a closed expression $P \in Exp_\Gamma$. While evaluating a program in
the extended language ($P : [] \Downarrow \_$), all calls and subevaluations will be from state to state.

In pure λ-calculus the evaluation relation is deterministic. The extended language is
nondeterministic since a nonterminal A may have A ::= e for more than one e.

Informally explained, consider a closed extended λ-expression e@B where nonterminal B
satisfies fv(B) = {}. Then e@B represents application of e to all possible inputs generated
by B. The analysis developed below can safely determine that e terminates on all inputs by
analysing e@B.

If a program in the extended language takes more than one input at a time, then we
may rename the nonterminals and bound variables similarly as in α-conversion. As an
example, if a program takes two Church numerals as input, then they can be given by two
grammars identical in structure:

    C1 ::= λs1.λz1.A1    A1 ::= z1    A1 ::= s1@A1    and
    C2 ::= λs2.λz2.A2    A2 ::= z2    A2 ::= s2@A2

and we can analyse the termination behaviour for (e@C1)@C2. Such renaming can sometimes
make the termination analysis more precise.

Definition 8.12. Suppose $e$ is a closed Γ-extended expression and $nt(e) = \{A_1, \ldots, A_k\}$,
where $\Gamma = (N, \Pi)$ is a λ-regular grammar. By definition $e$ is Γ-terminating iff

\[ e[t_1/A_1, \ldots, t_k/A_k] : [] \Downarrow \]

for all pure λ-expressions $t_1, \ldots, t_k$ such that $A_i \Rightarrow^*_\Gamma t_i$ for $i = 1, \ldots, k$.

The following rules for calls and evaluations with size-change graphs in the extended
language are simple extensions of the rules for pure λ-calculus to also handle nonterminals.

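Definition 8.13. (Environment-based evaluation and call semantics utilizing size-change
graphs) The judgement forms are $e : \rho \to e' : \rho', G$ and $e : \rho \Downarrow e' : \rho', G$, where $e, e' \in Exp_\Gamma$,
$e : \rho$ and $e' : \rho'$ are states, $source(G) = fv(e) \cup \{\epsilon\}$ and $target(G) = fv(e') \cup \{\epsilon\}$.
The evaluation and call relations are defined by the following inference rules, where
$\to\ =\ \to_r \cup \to_d \cup \to_c \cup \to_n$.

New rule:

\[ \dfrac{A ::= e \in \Gamma}{A : \rho \to_n e : \rho,\ id^{=}_{e}}\ (\text{GramG}) \]

Extended Def. 6.6 (ApplyG):

\[ \dfrac{e : \rho \to_x e' : \rho', G' \qquad e' : \rho' \Downarrow v, G \qquad x \in \{c, n\}}{e : \rho \Downarrow v,\ G';G}\ (\text{ResultG}) \]

The following rules have not been changed (but now expressions belong to $Exp_\Gamma$).

\[ \dfrac{}{\lambda x.e : \rho \Downarrow \lambda x.e : \rho,\ id^{=}_{\lambda x.e}}\ (\text{ValueG}) \]

\[ \dfrac{\rho(x) = e' : \rho'}{x : \rho \Downarrow e' : \rho',\ \{x \xrightarrow{=} \epsilon\} \cup \{x \xrightarrow{\downarrow} y \mid y \in fv(e')\}}\ (\text{VarG}) \]

\[ \dfrac{}{e_1@e_2 : \rho \to_r e_1 : \rho,\ id^{\downarrow}_{e_1}}\ (\text{OperatorG}) \qquad \dfrac{e_1 : \rho \Downarrow v_1}{e_1@e_2 : \rho \to_d e_2 : \rho,\ id^{\downarrow}_{e_2}}\ (\text{OperandG}) \]

\[ \dfrac{e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0, G_1 \qquad e_2 : \rho \Downarrow v_2, G_2}{e_1@e_2 : \rho \to_c e_0 : \rho_0[x \mapsto v_2],\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}}\ (\text{CallG}) \]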



Theorem 8.14. (The extracted graphs are safe) $s \to s', G$ or $s \Downarrow s', G$ implies $G$ is safe
for $(s, s')$.

Proof. This is shown by a case analysis as in the pure λ-calculus. For the (GramG) rule it
is immediate from the definition of free variables for non-terminals. □



8.3. Relating extended and pure λ-calculus.

The aim is now to show that execution of a program P in the extended language can simulate
execution of any program Q in the pure λ-calculus, where Q is derived from P by replacing
each nonterminal occurrence A in P with a pure λ-expression that A can produce. The converse
does not hold: it is possible that there are simulated executions that do not correspond to
any instantiated program Q. We have however certified a number of programs to terminate
when applied to arbitrary Church numerals. An example is given at the end of this section.

Properties of the relation $\Rightarrow^*_\Gamma$

$\Rightarrow^*_\Gamma$ relates expressions $e' \in Exp_\Gamma$ in the extended language to expressions $e \in Exp$ in the
pure lambda-calculus. Notice that there are only the following possible forms of $\Rightarrow^*_\Gamma$-related
expressions:

    $x \Rightarrow^*_\Gamma x$    $\lambda x.e' \Rightarrow^*_\Gamma \lambda x.e$    $e_1'@e_2' \Rightarrow^*_\Gamma e_1@e_2$
    $A \Rightarrow^*_\Gamma x$    $A \Rightarrow^*_\Gamma \lambda x.e$    $A \Rightarrow^*_\Gamma e_1@e_2$

The relation has the following inductive properties:

$A \Rightarrow^*_\Gamma t$, for $A \in N$, is given by Definition 8.2.
A variable $x$ corresponds to the same variable $x$ and nothing else.
$\lambda x.e' \Rightarrow^*_\Gamma \lambda x.e$ iff $e' \Rightarrow^*_\Gamma e$, same $x$.
$e_1'@e_2' \Rightarrow^*_\Gamma e_1@e_2$ iff $e_1' \Rightarrow^*_\Gamma e_1$ and $e_2' \Rightarrow^*_\Gamma e_2$.

Lemma 8.15. If $e' \Rightarrow^*_\Gamma e$ then $fv(e') \supseteq fv(e)$.

Proof. This is by induction on the structure of $e'$.
Case $x \Rightarrow^*_\Gamma x$: immediate.

Case $A \Rightarrow^*_\Gamma t$ where $A \in N$: by definition $fv(A) = \{x \mid \exists t.\ A \Rightarrow^*_\Gamma t \text{ and } x \in fv(t)\}$.

Case $\lambda x.e' \Rightarrow^*_\Gamma \lambda x.e$, iff $e' \Rightarrow^*_\Gamma e$: by induction the lemma holds for $e'$ and $e$. Therefore
$fv(\lambda x.e') = fv(e') \setminus \{x\} \supseteq fv(e) \setminus \{x\} = fv(\lambda x.e)$.

Case $e_1'@e_2' \Rightarrow^*_\Gamma e_1@e_2$, iff $e_1' \Rightarrow^*_\Gamma e_1$ and $e_2' \Rightarrow^*_\Gamma e_2$: by induction the lemma holds for $e_1', e_1$
and $e_2', e_2$. Hence $fv(e_1'@e_2') = fv(e_1') \cup fv(e_2') \supseteq fv(e_1) \cup fv(e_2) = fv(e_1@e_2)$. □
If $e \in Exp$, i.e., no nonterminals occur in $e$, then $e \Rightarrow^*_\Gamma e$. If $A \Rightarrow^*_\Gamma e$ then there exists
$t \notin N$ such that $A ::= t$ and $t \Rightarrow^*_\Gamma e$.

Definition 8.16. The relation S between states

Define the relation $S$ between states in the extended language and states in the pure
λ-calculus as the smallest relation $S$ such that:

$S(e' : \rho', e : \rho)$ if $e' \Rightarrow^*_\Gamma e$ and for all $x \in fv(e)$ it holds that $S(\rho'(x), \rho(x))$.

If $e : \rho$ is a state in the pure lambda calculus then it is also a state in the extended
language and $S(e : \rho, e : \rho)$.

Lemma 8.17. If $S(A : \rho', e : \rho)$ and $A ::= t$, $t \Rightarrow^*_\Gamma e$ then also $S(t : \rho', e : \rho)$. □

We now define a relation $T$ between size-change graphs. The intention is that $T(G', G)$
is to hold when the only difference in the generation of the graphs is due to nonterminals
that take the place of pure lambda expressions.

Definition 8.18. The relation T between size-change graphs

Define $T(G', G)$ to hold iff

i) $source(G') \supseteq source(G)$ and $target(G') \supseteq target(G)$.

ii) The subgraph of $G'$ restricted to $source(G)$ and $target(G)$ is a subset of $G$.

iii) Furthermore if $z \in source(G') \setminus source(G)$ then either there is no edge from $z$ in $G'$ or
the only edge from $z$ in $G'$ is $(z \xrightarrow{=} z)$, and if $(z \xrightarrow{=} z) \in G'$ then $z \notin target(G)$.

We have that $T(G_0', G_0)$, $T(G_1', G_1)$, $target(G_0') = source(G_1')$ and $target(G_0) = source(G_1)$
together imply that $T((G_0'; G_1'), (G_0; G_1))$ holds.
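The three conditions on T can be read as a direct membership test. In the sketch below a graph is an assumed triple `(source, target, arcs)` with arcs encoded as `(x, label, y)`, where `"="` and `">"` are the two labels and `"eps"` stands for the empty path; the function name `related_T` is hypothetical.

```python
def related_T(g_ext, g_pure):
    """Check conditions i) to iii) for the pair (G', G)."""
    (src_e, tgt_e, arcs_e), (src, tgt, arcs) = g_ext, g_pure
    # i) G' may only have more source and target nodes than G
    if not (src_e >= src and tgt_e >= tgt):
        return False
    # ii) on the nodes of G, every arc of G' must already be in G
    restricted = {(x, r, y) for (x, r, y) in arcs_e if x in src and y in tgt}
    if not restricted <= arcs:
        return False
    # iii) an extra source node has no arc, or only an identity arc,
    #      and in the latter case it is not a target node of G
    for z in src_e - src:
        out = {(x, r, y) for (x, r, y) in arcs_e if x == z}
        if out and (out != {(z, "=", z)} or z in tgt):
            return False
    return True


pure = ({"eps", "x"}, {"eps", "x"}, {("x", "=", "x"), ("eps", ">", "eps")})
ext_ok = ({"eps", "x", "s"}, {"eps", "x", "s"},
          {("x", "=", "x"), ("eps", ">", "eps"), ("s", "=", "s")})
ext_bad = ({"eps", "x", "s"}, {"eps", "x", "s"},
           {("x", "=", "x"), ("eps", ">", "eps"), ("s", ">", "x")})
```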

Lemma 8.19. Simulation Property

i) If $S(e' : \rho', e : \rho)$ and $e : \rho \Downarrow e_0 : \rho_0, G$ then there exist $e_0' : \rho_0', G'$ with $S(e_0' : \rho_0', e_0 : \rho_0)$
and $T(G', G)$ such that $e' : \rho' \Downarrow e_0' : \rho_0', G'$.

ii) If $S(e' : \rho', e : \rho)$ and $e : \rho \to_x e_0 : \rho_0, G$ with $x \in \{r, d, c\}$ then there exist $e_0' : \rho_0', G'$
and possibly $s$ such that either $e' : \rho' \to_x e_0' : \rho_0', G'$ or $e' : \rho' \to_n s \to_x e_0' : \rho_0', G'$ with
$S(e_0' : \rho_0', e_0 : \rho_0)$, $T(G', G)$, and in the last case $S(s, e : \rho)$.

The composite size-change graph for the double call $e' : \rho' \to_n s \to_x e_0' : \rho_0'$ will have
the same edges as $G'$ because the $\to_n$ call generates an $id^{=}$ graph.

Corollary 8.20. For programs $P \in Exp_\Gamma$ and $Q \in Exp$ with $P \Rightarrow^*_\Gamma Q$ it holds that:

If $Q : [] \to^* e : \rho$ then there exists $e' : \rho'$ such that $P : [] \to^* e' : \rho'$ and $S(e' : \rho', e : \rho)$.
If $Q : [] \Downarrow e : \rho$ then there exists $e' : \rho'$ such that $P : [] \Downarrow e' : \rho'$ and $S(e' : \rho', e : \rho)$.

Also notice that if $e_1 : \rho_1 \to_n e_2 : \rho_2$ then $fv(e_1) \supseteq fv(e_2)$ by the definition of free
variables for nonterminals. (By definition, $S(e' : \rho', e : \rho)$ implies $fv(e') \supseteq fv(e)$.)

Proof. Lemma 8.19 is shown by induction on the tree for the proof of evaluation or call in
the pure λ-calculus and uses the observation about free variables. Proof is in the appendix.
□
8.4. The subexpression property.

Definition 8.21. Given a state $s$ in the extended language, we define its expression support
$exp\_sup(s)$ by

\[ exp\_sup(e : \rho) = subexps(e) \cup \bigcup_{x \in fv(e)} exp\_sup(\rho(x)) \]

Lemma 8.22. (Subexpression property) If $s \Downarrow s'$ or $s \to s'$ then $exp\_sup(s) \supseteq exp\_sup(s')$.

Corollary 8.23. If $P : [] \Downarrow \lambda x.e : \rho$ then $\lambda x.e \in subexps(P)$. If $P : [] \to^* e : \rho$ then
$e \in subexps(P)$.

The proof of Lemma 8.22 follows the same lines as the proof of Lemma 3.8. The proof
for the rule (Gram) is immediate from the definition of subexpressions in the extended
language. Proof omitted.

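8.5. Approximate extended semantics with size-change graphs.

Definition 8.24. (Approximate evaluation and call rules for extended semantics with
size-change graphs). The judgement forms are now $e \to e', G$ and $e \Downarrow e', G$, where $e, e' \in Exp_\Gamma$,
and $source(G) = fv(e) \cup \{\epsilon\}$ and $target(G) = fv(e') \cup \{\epsilon\}$.

\[ \dfrac{A ::= e \in \Gamma}{A \to_n e,\ id^{=}_{e}}\ (\text{GramAG}) \qquad \dfrac{e \to_x e', G' \quad e' \Downarrow v, G \quad x \in \{c, n\}}{e \Downarrow v,\ G';G}\ (\text{ResultAG}) \]

\[ \dfrac{e_1@e_2 \in subexps(P) \quad e_1 \Downarrow \lambda x.e_0, G_1 \quad e_2 \Downarrow v_2, G_2}{x \Downarrow v_2,\ \{x \xrightarrow{=} \epsilon\} \cup \{x \xrightarrow{\downarrow} y \mid y \in fv(v_2)\}}\ (\text{VarAG}) \]

\[ \dfrac{}{e_1@e_2 \to_r e_1,\ id^{\downarrow}_{e_1}}\ (\text{OperatorAG}) \qquad \dfrac{}{e_1@e_2 \to_d e_2,\ id^{\downarrow}_{e_2}}\ (\text{OperandAG}) \]

\[ \dfrac{}{\lambda x.e \Downarrow \lambda x.e,\ id^{=}_{\lambda x.e}}\ (\text{ValueAG}) \qquad \dfrac{e_1 \Downarrow \lambda x.e_0, G_1 \quad e_2 \Downarrow v_2, G_2}{e_1@e_2 \to_c e_0,\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}}\ (\text{CallAG}) \]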

Putting the pieces together, we now show how to analyse any program in the regular
grammar-extended λ-calculus. Let P be a program in the extended language.

Definition 8.25.

\[ absintExt(P) = \{\, G_j \mid j > 0 \wedge \exists e_i, G_i\ (0 < i \le j) : P = e_0 \wedge (e_0 \to e_1, G_1) \wedge \ldots \wedge (e_{j-1} \to e_j, G_j) \,\} \]

Theorem 8.26. The set $absintExt(P)$ can be effectively computed from P.

Proof. In the extended λ-calculus there is only a fixed number of subexpressions of P, and
a fixed number of possible size-change graphs with

    source, target ⊆ {ε} ∪ {x | x is a variable that occurs in a subexpression of P}

Thus $absintExt(P)$ can be computed in finite time by applying Definition 8.24 exhaustively,
starting with P, until no new graphs or subexpressions are obtained. □
8.6. Simulation properties of approximate extended semantics. We will show the
following properties of approximate extended semantics:

(1) Calls and evaluations for a program in extended semantics with environments can be
stepwise simulated by approximate extended semantics with identical size-change graphs
associated with corresponding calls and evaluations. To a call or evaluation in the
extended λ-calculus with environments corresponds the same call or evaluation with
environments removed.

(2) Suppose $P \Rightarrow^*_\Gamma Q$ for programs P, Q. Then calls and evaluations for Q in the pure lambda
calculus with environments can be simulated by calls and evaluations in the approximate
extended semantics for P using the relations S and T.

(3) The extra edges in the size-change graphs in extended semantics can never give rise to
incorrect termination analysis.

Lemma 8.27. Let P be a program in the extended language and $P : [] \to^* e : \rho$.
If $e : \rho \to e_0 : \rho_0, G$ then $e \to e_0, G$ in approximate semantics.
If $e : \rho \Downarrow e_0 : \rho_0, G$ then $e \Downarrow e_0, G$ in approximate semantics.

Proof. The proof is similar to the proofs for approximation of the pure lambda-calculus
(Lemmas 3.11 and 6.10). For rules (Value), (Operator), (Operand) it is immediate. The (Gram)-rule
does not refer to the environment, hence the lemma holds if the (Gram)-rule has been applied.
For rules (Call) and (Result) it holds by induction. For the (Var)-rule we need induction
on the total size of the derivation, and we can argue as in the case of the pure lambda
calculus. □

Lemma 8.28. Let P be a program in the extended language and Q a program in the pure
λ-calculus with P =>p Q.

If $Q : [] \to^* e : \rho$ and $e : \rho \Downarrow e_0 : \rho_0, G$ then there exist $e', e'_0, G'$ with e' =>p e, e'_0 =>p e_0,
$T(G', G)$ such that $P \to^* e'$ and $e' \Downarrow e'_0, G'$.

If $Q : [] \to^* e : \rho$ and $e : \rho \to_x e_0 : \rho_0, G$, $x \in \{r, d, c\}$, then there exist $e', e'_0, G'$ with
e' =>p e, e'_0 =>p e_0, $T(G', G)$ such that $P \to^* e'$ and either $e' \to_x e'_0, G'$ or $e' \to e'' \to_x e'_0, G'$,
where in the last case G' is the composite size-change graph for the double call.

Proof. The lemma follows from the simulation property (Lemma 8.19) together with
Lemma 8.27. □

Theorem 8.29. 

(1) Let P be a program in the extended language. If there is a program Q in the pure lambda-
calculus such that P =>p Q and there exists an infinite call-sequence in the call-graph for
Q in the exact semantics, then there exists an infinite call-sequence with no infinitely
descending thread in the call-graph for P in the approximate extended semantics.

(2) It follows that if each infinite call-sequence in the call-graphs for P in the approximate
extended semantics has an infinitely descending thread, then P is T-terminating.

Proof. (1): Assume an infinite call-sequence exists in the call-graph for Q. By the safety
of the size-change graphs in the pure λ-calculus, the size-change graphs associated with
this call sequence cannot have an infinitely descending thread. By Lemma 8.28 there exists
a simulating call-sequence in the call-graph for P such that the corresponding size-change
graphs are in the T-relation. Let $G_P, G_Q$ be any two such corresponding T-related
size-change graphs from these call-sequences, $T(G_P, G_Q)$. By the definition of the T-relation
it holds that the largest subgraph of $G_P$, with source and target the same as $source(G_Q)$
and $target(G_Q)$, is equal to or a subset of $G_Q$. We need to show that the possible extra
variables in the size-change graphs for the simulating sequence in the call-graph for P can
never take part in an infinitely descending thread. By the definition of the T-relation it
holds that an edge leaving from such a variable x must have the form $(x \xrightarrow{=} x)$ if any
exists in the simulating sequence. Also by the definition of the T-relation, if $T(G_P, G_Q)$ and
$(x \xrightarrow{=} x) \in G_P$ then $x \notin codomain(G_Q)$. Hence an extra thread in the size-change
graphs going out from x will either be finite or be infinitely equal, $x \xrightarrow{=} x \xrightarrow{=} x \xrightarrow{=} \ldots$, i.e.
an extra variable can never take part in an infinitely descending thread in the simulating
sequence.

(2) is a corollary to (1). □ 
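The condition in part (2), that every infinite call-sequence has an infinitely descending thread, is decidable by the closure criterion of Lee, Jones and Ben-Amram: close the graph set under composition and require every idempotent self-loop graph to contain a strictly decreasing self-arc. The Python sketch below illustrates that criterion under stated assumptions; the triple encoding of arcs, the names `compose` and `sct`, and the example graph are all hypothetical and not taken from the authors' implementation.

```python
from itertools import product

DOWN, EQ = "down", "eq"  # arc labels: strictly decreasing / non-increasing


def compose(g1, g2):
    """Sequential composition g1;g2 of two arc sets of triples (p, label, q)."""
    best = {}
    for (p, r1, q), (q2, r2, s) in product(g1, g2):
        if q == q2:
            label = DOWN if DOWN in (r1, r2) else EQ
            if best.get((p, s)) != DOWN:  # a strict arc wins over an equal one
                best[(p, s)] = label
    return frozenset((p, r, s) for (p, s), r in best.items())


def sct(graphs):
    """graphs: set of (source_point, target_point, arcs). True if the criterion holds."""
    closure = set(graphs)
    changed = True
    while changed:
        changed = False
        for (a, b, g), (c, d, h) in product(list(closure), repeat=2):
            if b == c:  # the two calls can follow each other
                new = (a, d, compose(g, h))
                if new not in closure:
                    closure.add(new)
                    changed = True
    for (a, b, g) in closure:
        if a == b and compose(g, g) == g:  # idempotent self-loop
            if not any(p == s and r == DOWN for (p, r, s) in g):
                return False
    return True


# Hypothetical self-loop with parameter exchange: x decreases into y, y is copied to x.
exchange = ("f", "f", frozenset({("x", DOWN, "y"), ("y", EQ, "x")}))
```

A graph whose only arc is an equality self-arc is rejected, while the exchange graph is accepted because composing it with itself yields a strict self-arc on both parameters.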

Example 8.30. The following is an example of a program certified to terminate by our
proof method. The program computes $x + 2^n$ when applied to two arbitrary Church numerals
for x and n. In Section 7 we analysed the program applied to Church numerals 3 and 4
(Example E2D).

Grammar for Church numerals: C ::= λs.λz.A    A ::= z | s@A

The program applied to two Church numerals:

[λn1.λn2. n1                          -- n --
    @ [λr.λa. 11:(r @ 13:(r@a))]      -- g --
    @ [λk.λp.λq. (p@((k@p)@q))]       -- succ --
    @ n2 ]                            -- x --
  @ C                                 -- Church numeral --
  @ C                                 -- Church numeral --
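To make the input set concrete, the members of the grammar C ::= λs.λz.A, A ::= z | s@A can be enumerated directly. The Python sketch below is illustrative only; the tuple encoding of terms and the helper names `church` and `show` are assumptions introduced here, not part of the paper's implementation.

```python
def church(n):
    """Term derived from C ::= λs.λz.A with n uses of the production A ::= s@A."""
    body = ("var", "z")                      # A ::= z
    for _ in range(n):
        body = ("app", ("var", "s"), body)   # A ::= s@A
    return ("lam", "s", ("lam", "z", body))


def show(e):
    """Pretty-printer that is adequate for the numerals produced by `church` only."""
    if e[0] == "var":
        return e[1]
    if e[0] == "lam":
        return "λ" + e[1] + "." + show(e[2])
    right = show(e[2])
    if e[2][0] == "app":
        right = "(" + right + ")"
    return show(e[1]) + "@" + right


two = church(2)
```

Every natural number n yields one member of the regular set, so the analysis has to cover infinitely many inputs with a single finite computation.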

Following is the output from program analysis. The analysis found the following loops
from a program point to itself with the associated size-change graph and path. The first
number refers to the program point, then comes a list of edges and last a list of numbers,
the other program points that the loop passes through. The program points are found
automatically by the analysis. The program points 30 and 32 are not written into the
presentation of the program because they involve the subexpression A of a Church numeral.
The subexpression associated with 30 is A and the subexpression associated with 32 is
s@A. The loops from 30 to itself and from 32 to itself in the output correspond to the call
sequence A → s@A → A → s@A → ...

SELF SCGS no repetition of graphs:

Loop       Edges                            Path
11 → 11    [(r,>,r)]                        []
11 → 11    [(…,a), (r,>,r)]                 [13]
13 → 13    [(…,a), (r,>,r)]                 [11]
13 → 13    [(r,>,r)]                        [11,11]
30 → 30    [(ε,>,ε), (s,=,s), (z,=,z)]      [32]
32 → 32    [(ε,>,ε), (s,=,s), (z,=,z)]      [30]


Size-Change Termination: Yes 
9. Concluding matters

We have developed a method based on the Size-Change Principle to show termination
of a closed expression in the untyped λ-calculus. This is further developed to analyse if
a program in the λ-calculus will terminate when applied to any input from a given input
set defined by a tree grammar. The analysis is safe and the method can be completely
automated. We have a simple first implementation. The method certifies termination
of many interesting recursive programs, including programs with mutual recursion and
parameter exchange.

Acknowledgements. The authors gratefully acknowledge detailed and constructive com- 
ments by Arne Glenstrup, Chin Soon Lee and Damien Sereni, and insightful comments 
by Luke Ong, David Wahlstedt and Andreas Abel. 

Appendix A. Proof of Lemma 2.6

Proof. ⇒: Assume $P \Downarrow$. To show: CT has no infinite call chain starting with P. The proof
is by induction on the height of the proof tree. Each call rule of 2.6 is associated with a
use of rule (ApplyS) from Definition 2.2. So if P is a value, there is no call from P. If $P \Downarrow$ is
concluded by rule (ApplyS), then $P = e_1@e_2$ and by induction there is no infinite call chain
starting with $e_1$, $e_2$ and $e_0[v_2/x]$. All call chains starting with P go directly to one of these.
So, there are no infinite call chains starting with P.

⇐: Assume CT has no infinite call chain starting with P. To show: $P \Downarrow$. Since the call
tree is finitely branching, by König's lemma the whole call tree is finite, and hence there
exists a finite number m bounding the length of all branches.

We prove that $e \Downarrow$ for any expression e in the call tree, by induction on the maximal
length n of a call chain from e.

n = 0: e is an abstraction that evaluates to itself.

n > 0: e must be an application $e = e_1@e_2$. By rule (Operator) there is a call
$e_1@e_2 \to_r e_1$, and the maximal length of a call chain from $e_1$ is less than n. By induction
there exists $v_1$ such that $e_1 \Downarrow v_1$. We now conclude by rule (Operand) that $e_1@e_2 \to_d e_2$.
By induction there exists $v_2$ such that $e_2 \Downarrow v_2$.

All values are abstractions, so we can write $v_1 = \lambda x.e_0$. We now conclude by rule (Call)
that $e_1@e_2 \to_c e_0[v_2/x]$. By induction again, $e_0[v_2/x] \Downarrow v$ for some v. This gives us all
premises for the (ApplyS) rule of Definition 2.2, so $e = e_1@e_2 \Downarrow v$. □
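The induction measure used in the ⇐ direction, the maximal length n of a call chain from e, can be computed alongside a substitution-based call-by-value evaluator. The Python sketch below is an illustration under assumptions: the tuple encoding of terms and the names `subst` and `eval_cbv` are hypothetical, terms are assumed closed (so substituting closed values cannot capture variables), and a non-terminating input simply recurses until Python raises `RecursionError`.

```python
def subst(e, x, v):
    """e[v/x] for a closed value v."""
    if e[0] == "var":
        return v if e[1] == x else e
    if e[0] == "lam":
        return e if e[1] == x else ("lam", e[1], subst(e[2], x, v))
    return ("app", subst(e[1], x, v), subst(e[2], x, v))


def eval_cbv(e):
    """Return (value, n), where n is the maximal call-chain length from e."""
    if e[0] == "lam":
        return e, 0                       # n = 0: an abstraction evaluates to itself
    if e[0] == "var":
        raise ValueError("open term")
    v1, n1 = eval_cbv(e[1])               # operator call
    v2, n2 = eval_cbv(e[2])               # operand call
    body = subst(v1[2], v1[1], v2)        # call to e0[v2/x]
    v, n0 = eval_cbv(body)
    return v, 1 + max(n1, n2, n0)


I = ("lam", "x", ("var", "x"))
result = eval_cbv(("app", ("app", I, I), I))
```

Each application contributes exactly the three calls named in the proof, which is why the measure of an application is one more than the largest measure among its three callees.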



Appendix B. Proof of Lemma 3.11

Proof. To be shown:

If $P : [] \to^* e : \rho$ and $e : \rho \Downarrow e' : \rho'$, then $e \Downarrow e'$.

If $P : [] \to^* e : \rho$ and $e : \rho \to e' : \rho'$, then $e \to e'$.

We prove both parts of Lemma 3.11 by course-of-value induction over the size $n = |\mathcal{D}|$ of a
deduction $\mathcal{D}$ by Definition 3.3 of the assumption

$P : [] \to^* e : \rho \ \land\ e : \rho \Downarrow e' : \rho'$ or $P : [] \to^* e : \rho \ \land\ e : \rho \to e' : \rho'$

The deduction size may be thought of as the number of steps in the computation of $e : \rho \Downarrow e' : \rho'$
or $e : \rho \to e' : \rho'$ starting from $P : []$.

The induction hypothesis IH(n) is that the Lemma holds for all deductions of size not
exceeding n. This implies that the Lemma holds for all calls and evaluations performed in
the computation before the last conclusion giving ($P : [] \to^* e : \rho$ and $e : \rho \Downarrow e' : \rho'$) or
($P : [] \to^* e : \rho$ and $e : \rho \to e' : \rho'$), i.e., the Lemma holds for premises of the rule last
applied, and for any call and evaluation in the computation until then.

Proof is by cases on which rule is applied to conclude $e : \rho \Downarrow e' : \rho'$ or $e : \rho \to e' : \rho'$.
In all cases we show that some corresponding abstract interpretation rules can be applied
to give the desired conclusion.

Base cases: Rules (Value), (Operator) and (Operand) in the exact semantics (Def. 3.3)
are modeled by axioms (ValueA), (OperatorA) and (OperandA) in the abstract semantics
(Def. 3.10). These are the same as their exact-evaluation counterparts, after removal of
environments for (ValueA) and (OperatorA), and a premise as well for (OperandA). Hence
the Lemma holds if one of these rules was the last one applied.

The (Var) rule is, however, rather different from the (VarA) rule. If (Var) was applied
to a variable x then the assumption is ($P : [] \to^* x : \rho$ and $x : \rho \Downarrow e' : \rho'$). In this case
$x \in dom(\rho)$ and $e' : \rho' = \rho(x)$. The total size of the deduction (of both parts together) is n.

Now $P : [] \to^* x : \rho$ begins from the empty environment, and we know all calls are from
state to state. The only possible way x can have been bound is by a previous use of the
(Call) rule, the only rule that extends an environment.¹

The premises of the (Call) rule require that operator and operand in an application
have previously been evaluated. So it must be the case that there exist $e_1@e_2 : \rho''$ and
$\lambda x.e_0 : \rho_0$ such that ($P : [] \to^* e_1@e_2 : \rho''$ and $e_1 : \rho'' \Downarrow \lambda x.e_0 : \rho_0$ and $e_2 : \rho'' \Downarrow e' : \rho'$)
and the sizes of both deductions are strictly smaller than n. By the Subexpression Lemma,
$e_1@e_2 \in subexp(P)$. By induction, Lemma 3.11 holds for both $e_1 : \rho'' \Downarrow \lambda x.e_0 : \rho_0$ and
$e_2 : \rho'' \Downarrow e' : \rho'$, so $e_1 \Downarrow \lambda x.e_0$ and $e_2 \Downarrow e'$ in the abstract semantics. Now we have all
premises of rule (VarA), so we can conclude that $x \Downarrow e'$ as required.

For the remaining rules (Apply) and (Call), when we assume that the Lemma holds for the
premises in the rule applied to conclude $e \Downarrow e'$ or $e \to e'$, then this gives us the premises for
the corresponding rule for abstract interpretation. From this we can conclude the desired
result. □

Appendix C. Proof of Lemma 5.3

Proof. Define the length L(e) of an expression e by:

$L(x) = 1 \qquad L(\lambda x.e) = 1 + L(e) \qquad L(e_1@e_2) = 1 + L(e_1) + L(e_2)$

For any expression e, L(e) is a natural number > 0. For a program, the length of the initial
expression bounds all lengths of occurring expressions.

Define for a state s the height H(s) of the state to be the height of the environment:

$H(e : \rho) = \max\{\,1 + H(\rho(x)) \mid x \in fv(e)\,\}$

So, $H(e : []) = 0$, the maximum of the empty set, and for any state $e : \rho$, $H(e : \rho)$ is
a natural number ≥ 0. Let $>_{lex}$ stand for the lexicographic order relation on pairs of natural
numbers, hence $>_{lex}$ is well-founded. We prove that the relation $\succ$ on states is well-founded
by proving that $e_1 : \rho_1 \succ e_2 : \rho_2$ implies that

$(H(e_1 : \rho_1), L(e_1)) >_{lex} (H(e_2 : \rho_2), L(e_2))$

¹ This must have occurred in the part $P : [] \to^* x : \rho$.

First, consider $\succ_1$. Clearly, if $e_1 : \rho_1 \succ_1 e_2 : \rho_2$ then $H(e_1 : \rho_1) > H(e_2 : \rho_2)$. Hence
even though $L(e_2)$ might be larger than $L(e_1)$, it holds that in the lexicographic order
$(H(e_1 : \rho_1), L(e_1)) >_{lex} (H(e_2 : \rho_2), L(e_2))$.

Now, consider $\succ_2$. If $e_1 : \rho_1 \succ_2 e_2 : \rho_2$ then $H(e_1 : \rho_1) \ge H(e_2 : \rho_2)$ and $L(e_1) > L(e_2)$,
hence in the lexicographic order $(H(e_1 : \rho_1), L(e_1)) >_{lex} (H(e_2 : \rho_2), L(e_2))$. Trivially,
$e_1 : \rho_1 = e_2 : \rho_2$ implies $(H(e_1 : \rho_1), L(e_1)) =_{lex} (H(e_2 : \rho_2), L(e_2))$.

Recall, by definition $\succeq$ is the transitive closure of $\succ_1 \cup \succ_2 \cup =$, and $s_1 \succ s_2$ holds when
$s_1 \succeq s_2$ and $s_1 \ne s_2$. So, from the derivations above we can conclude that $e_1 : \rho_1 \succ e_2 : \rho_2$
implies $(H(e_1 : \rho_1), L(e_1)) >_{lex} (H(e_2 : \rho_2), L(e_2))$, hence the relation $\succ$ on states is
well-founded.

□ 
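The measure used in this proof is easy to compute. The Python sketch below is illustrative only: the tuple encoding of expressions, the representation of a state as a pair (expression, environment dictionary), and the names `fv`, `length`, `height` and `measure` are assumptions made here. Python compares tuples lexicographically, which matches the order on pairs used in the proof.

```python
def fv(e):
    """Free variables of an expression."""
    if e[0] == "var":
        return {e[1]}
    if e[0] == "lam":
        return fv(e[2]) - {e[1]}
    return fv(e[1]) | fv(e[2])


def length(e):
    """L(x) = 1, L(λx.e) = 1 + L(e), L(e1@e2) = 1 + L(e1) + L(e2)."""
    if e[0] == "var":
        return 1
    if e[0] == "lam":
        return 1 + length(e[2])
    return 1 + length(e[1]) + length(e[2])


def height(state):
    """H(e:ρ) = max{1 + H(ρ(x)) | x in fv(e)}, with 0 for the empty set."""
    e, env = state
    return max((1 + height(env[x]) for x in fv(e)), default=0)


def measure(state):
    """The pair (H(s), L(e)), compared lexicographically."""
    return (height(state), length(state[0]))


closure = (("lam", "z", ("var", "z")), {})   # a value in the empty environment
lookup = (("var", "x"), {"x": closure})      # the state x:ρ with ρ(x) = closure
```

Here the variable lookup moves from `lookup` to `closure`: the expression length grows from 1 to 2, but the height drops from 1 to 0, so the pair still decreases in the lexicographic order.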



Appendix D. Proof of Theorem 6.8

Proof. For the "safety" theorem we use induction on proofs of $s \Downarrow s', G$ or $s \to s', G$. Safety
of the constructed graphs for rules (ValueG), (OperatorG) and (OperandG) is immediate
by Definitions 6.2 and 5.3.

In the following x, y, z are variables and p, q can be variables or $\epsilon$.

The variable lookup rule (VarG) yields $x : \rho \Downarrow \rho(x), G$ with $G = \{x \xrightarrow{\downarrow} y \mid y \in fv(e')\} \cup \{x \xrightarrow{=} \epsilon\}$
and $\rho(x) = e' : \rho'$. By Definition 5.2, $x : \rho(x) = \rho(x)(\epsilon)$, so arc $x \xrightarrow{=} \epsilon$
satisfies Definition 6.2. Further, if $x \xrightarrow{\downarrow} y \in G$ then $y \in fv(e')$. Thus
$x : \rho(x) = \rho(x) = e' : \rho' \succ \rho'(y) = \rho(x)(y)$ as required.

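The rule (CallG) concludes $s \to_c s', G$, where $s = e_1@e_2 : \rho$ and $s' = e_0 : \rho_0[x \mapsto v_2]$
and $G = G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}$. Its premises are $e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0, G_1$ and $e_2 : \rho \Downarrow v_2, G_2$. We
assume inductively that $G_1$ is safe for $(e_1 : \rho, \lambda x.e_0 : \rho_0)$ and that $G_2$ is safe for $(e_2 : \rho, v_2)$.
Let $v_2 = e' : \rho'$.

We wish to show safety: that $p \xrightarrow{=} p' \in G$ implies $s(p) \succeq s'(p')$, and $p \xrightarrow{\downarrow} p' \in G$ implies
$s(p) \succ s'(p')$. By definition of $G_1^{-\epsilon/\lambda x.e_0}$ and $G_2^{\epsilon \mapsto x}$, $p \xrightarrow{r} p' \in G = G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}$ breaks
into 7 cases:

Case 1: $y \xrightarrow{\downarrow} z \in G_1^{-\epsilon/\lambda x.e_0}$ because $y \xrightarrow{\downarrow} z \in G_1$. By safety of $G_1$, $e_1 : \rho(y) \succ \lambda x.e_0 : \rho_0(z)$.
Thus, as required,

$s(y) = e_1@e_2 : \rho(y) = e_1 : \rho(y) \succ \lambda x.e_0 : \rho_0(z) = e_0 : \rho_0[x \mapsto v_2](z) = s'(z)$

Case 2: $y \xrightarrow{=} z \in G_1^{-\epsilon/\lambda x.e_0}$ because $y \xrightarrow{=} z \in G_1$. Like Case 1.

Case 3: $y \xrightarrow{r} \epsilon \in G_1^{-\epsilon/\lambda x.e_0}$ because $y \xrightarrow{r} \epsilon \in G_1$; then $x \notin fv(e_0)$ by the definition of $G_1^{-\epsilon/\lambda x.e_0}$,
and then $e_0 : \rho_0[x \mapsto v_2] = e_0 : \rho_0$. By safety of $G_1$, $e_1 : \rho(y) \succeq \lambda x.e_0 : \rho_0(\epsilon) = \lambda x.e_0 : \rho_0$.
Thus, as required,

$s(y) = e_1@e_2 : \rho(y) = e_1 : \rho(y) \succeq \lambda x.e_0 : \rho_0 \succeq e_0 : \rho_0 = s'(\epsilon)$

Case 4: $\epsilon \xrightarrow{\downarrow} p \in G_1^{-\epsilon/\lambda x.e_0}$ because $\epsilon \xrightarrow{r} p \in G_1$. Then it holds that either p is a variable
name or $x \notin fv(e_0)$. Now $\epsilon$ in $G_1$ refers to $e_1 : \rho$, so $e_1 : \rho \succeq \lambda x.e_0 : \rho_0(p)$ by safety of $G_1$.
Thus, as required,

$s(\epsilon) = e_1@e_2 : \rho \succ e_1 : \rho \succeq \lambda x.e_0 : \rho_0(p) \succeq e_0 : \rho_0[x \mapsto v_2](p) = s'(p)$

Case 5: $y \xrightarrow{\downarrow} x \in G$ because $x \in fv(e_0)$ and $y \xrightarrow{\downarrow} x \in G_2^{\epsilon \mapsto x}$ because $y \xrightarrow{\downarrow} \epsilon \in G_2$. By safety
of $G_2$, $e_2 : \rho(y) \succ v_2(\epsilon)$. Thus, as required,

$s(y) = e_1@e_2 : \rho(y) = e_2 : \rho(y) \succ v_2(\epsilon) = e_0 : \rho_0[x \mapsto v_2](x) = s'(x)$

Case 6: $y \xrightarrow{=} x \in G$ because $x \in fv(e_0)$ and $y \xrightarrow{=} x \in G_2^{\epsilon \mapsto x}$ because $y \xrightarrow{=} \epsilon \in G_2$. Like Case 5.

Case 7: $\epsilon \xrightarrow{r} x \in G$ because $x \in fv(e_0)$ and $\epsilon \xrightarrow{r} x \in G_2^{\epsilon \mapsto x}$ because $\epsilon \xrightarrow{r} \epsilon \in G_2$. By safety of
$G_2$, $e_2 : \rho(\epsilon) = e_2 : \rho$. Thus, as required,

$s(\epsilon) = e_1@e_2 : \rho \succ e_2 : \rho \succeq v_2(\epsilon) = \rho_0[x \mapsto v_2](x) = s'(x)$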

The rule (ApplyG) concludes $s \Downarrow v, G';G$ from premises $s \to s', G'$ and $s' \Downarrow v, G$,
where $s = e_1@e_2 : \rho$ and $s' = e' : \rho'$. We assume inductively that G' is safe for $(s, s')$ and
G is safe for $(s', v)$. Let $G_0 = G';G$.

We wish to show that $G_0$ is safe: that $p \xrightarrow{=} q \in G_0$ implies $s(p) \succeq v(q)$, and $p \xrightarrow{\downarrow} q \in G_0$
implies $s(p) \succ v(q)$ (p, q can be variables or $\epsilon$). First, consider the case $p \xrightarrow{=} q \in G_0$.
Definition 4.2 implies $p \xrightarrow{=} p' \in G'$ and $p' \xrightarrow{=} q \in G$ for some p'. Thus by the inductive
assumptions we have $s(p) \succeq s'(p') \succeq v(q)$, as required.

Second, consider the case $p \xrightarrow{\downarrow} q \in G_0$. Definition 4.2 implies $p \xrightarrow{r_1} p' \in G'$ and
$p' \xrightarrow{r_2} q \in G$ for some p', where either one or both of $r_1, r_2$ are $\downarrow$. By the inductive assumptions
we have $s(p) \succeq s'(p')$ and $s'(p') \succeq v(q)$, and one or both of $s(p) \succ s'(p')$ and $s'(p') \succ v(q)$
hold. By definition of $\succ$ and $\succeq$ this implies that $s(p) \succ v(q)$, as required.

□ 

Appendix E. Proof of Lemma 6.10

Proof. The rules are the same as in Section 3.10, only extended with size-change graphs. We
need to add to Lemma 3.11 that the size-change graphs generated for calls and evaluations
can also be generated by the abstract interpretation. The proof is by cases on which rule is
applied to conclude $e \Downarrow e', G$ or $e : \rho \to e' : \rho', G$.

We build on Lemma 3.11, and we saw in its proof that in abstract interpretation
we can always use a rule corresponding to the one used in exact computation to prove
corresponding steps. The induction hypothesis is that the Lemma holds for the premises of
the rule in exact semantics.

Base case (VarAG): By Lemma 3.11 we have that $x : \rho \Downarrow e' : \rho'$ implies $x \Downarrow e'$. The
size-change graph built in (VarAG) is derived in the same way from x and e' as in rule (VarG),
and they will therefore be identical.

For other call and evaluation rules without premises, the abstract evaluation rule is the same as
the exact-evaluation rule, only with environments removed, and the generated size-change
graphs are not influenced by environments. Hence the Lemma will hold if these rules are
applied.

For all other rules in a computation: when we know that Lemma 3.11 holds and assume
that Lemma 6.10 holds for the premises, then we can conclude that if this rule is applied,
then Lemma 6.10 holds by the corresponding rule from abstract interpretation. □

Appendix F. Proof of Lemma 8.19

Proof. By induction on the tree for the proof of evaluation or call in the pure λ-calculus.

Possible cases of the structure of $e' : \rho'$ and $e : \rho$ in S-related states:

$(x : \rho', x : \rho) \qquad (\lambda x.e' : \rho', \lambda x.e : \rho) \qquad (e'_1@e'_2 : \rho', e_1@e_2 : \rho)$

$(A : \rho', x : \rho) \qquad (A : \rho', \lambda x.e : \rho) \qquad (A : \rho', e_1@e_2 : \rho)$

Base cases, evaluations and calls in pure λ-calculus by rules without premises.

Case $S(x : \rho', x : \rho)$: No calls from $x : \rho$.

(Var)-rule, $x : \rho \Downarrow \rho(x) = e_0 : \rho_0,\ \{x \xrightarrow{=} \epsilon\} \cup \{x \xrightarrow{\downarrow} y \mid y \in fv(e_0)\}$ and
$x : \rho' \Downarrow \rho'(x) = e'_0 : \rho'_0,\ \{x \xrightarrow{=} \epsilon\} \cup \{x \xrightarrow{\downarrow} y \mid y \in fv(e'_0)\}$. Beginning from S-related states, by definition
of the relation S we have $S(\rho'(x), \rho(x))$ and $fv(e'_0) \supseteq fv(e_0)$. $source(G') = source(G)$ and
the generation of size-change graphs gives that the restriction of G' to $target(G)$ equals G,
hence $T(G', G)$.

Case $S(\lambda x.e' : \rho', \lambda x.e : \rho)$: No calls from $\lambda x.e : \rho$.

(Value)-rule, $\lambda x.e : \rho \Downarrow \lambda x.e : \rho,\ id^{=}_{\lambda x.e}$ and $\lambda x.e' : \rho' \Downarrow \lambda x.e' : \rho',\ id^{=}_{\lambda x.e'}$. $T(id^{=}_{\lambda x.e'}, id^{=}_{\lambda x.e})$.

Case $S(e'_1@e'_2 : \rho', e_1@e_2 : \rho)$:

(Operator)-rule, $e_1@e_2 : \rho \to_r e_1 : \rho,\ id^{\downarrow}_{e_1}$ and $e'_1@e'_2 : \rho' \to_r e'_1 : \rho',\ id^{\downarrow}_{e'_1}$. Beginning from
S-related states, by definition of the relation S we have $S(e'_1 : \rho', e_1 : \rho)$. Then $T(id^{\downarrow}_{e'_1}, id^{\downarrow}_{e_1})$.

Case $S(A : \rho', x : \rho)$:

(Var)-rule: $x : \rho \Downarrow \rho(x) = e_0 : \rho_0, G$ where $G = \{x \xrightarrow{=} \epsilon\} \cup \{x \xrightarrow{\downarrow} y \mid y \in fv(e_0)\}$. By
the definition of S we must have A =>p x. This again by Lemma 8.7 gives that we must
have A ::= x. Then $A : \rho' \to x : \rho',\ id^{=}_{x}$ by (Gram)-rule, and we have $S(x : \rho', x : \rho)$. Also
$x : \rho' \Downarrow \rho'(x) = e'_0 : \rho'_0, G''$ where $G'' = \{x \xrightarrow{=} \epsilon\} \cup \{x \xrightarrow{\downarrow} y \mid y \in fv(e'_0)\}$ by (Var)-rule. The
edges in G'' are the same as the edges in $G' = id^{=}_{x}; G''$. Hence by (Result)-rule $A \Downarrow \rho'(x), G'$.
As before $S(\rho'(x), \rho(x))$ and $T(G', G)$.

Cases $S(A : \rho', \lambda x.e : \rho)$ with (Value)-rule, and $S(A : \rho', e_1@e_2 : \rho)$ with (Operator)-rule:
Similarly by use of Lemma 8.7 and reasoning as above. We will use the rules (Gram) (Value)
(Result) and (Gram) (Operator) respectively, where (Value) and (Operator) do not have
premises.

Step cases.

Case $S(e'_1@e'_2 : \rho', e_1@e_2 : \rho)$. $e_1@e_2 : \rho \to_d e_2 : \rho,\ id^{\downarrow}_{e_2}$ by (Operand)-rule. It follows from
the definition of S that also $S(e'_1 : \rho', e_1 : \rho)$, hence by IH since $e_1 : \rho \Downarrow$ then also $e'_1 : \rho' \Downarrow$,
and then $e'_1@e'_2 : \rho' \to_d e'_2 : \rho',\ id^{\downarrow}_{e'_2}$, and by the definition of S we have $S(e'_2 : \rho', e_2 : \rho)$.

The next case is the one that requires the most consideration to see that we stay within
the T-relation. Assume we know for graphs G', G that the restriction of G' to source
and target of G is a subset of G. Notice, if $y \in source(G') \setminus source(G)$ and $x, z \in
target(G') \setminus target(G)$, then for testing $T(G', G)$ we only need to look at which edges leave
from x, y; we do not need to care whether other edges go into x, z.

Case $S(e'_1@e'_2 : \rho', e_1@e_2 : \rho)$. $e_1@e_2 : \rho \to_c e_0 : \rho_0[x \mapsto v_2],\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x}$ by (Call)-rule,
where we have the premises $e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0, G_1$ and $e_2 : \rho \Downarrow v_2, G_2$.

It follows from the definition of S that also $S(e'_1 : \rho', e_1 : \rho)$ and $S(e'_2 : \rho', e_2 : \rho)$. Hence
by IH since $e_1 : \rho \Downarrow \lambda x.e_0 : \rho_0, G_1$ then also $e'_1 : \rho' \Downarrow v, G'_1$ where $T(G'_1, G_1)$ and $S(v, \lambda x.e_0 : \rho_0)$.
Then by definition of values, relations =>p and S we must have $v = \lambda x.e'_0 : \rho'_0$. Also
by IH since $e_2 : \rho \Downarrow v_2, G_2$ then also $e'_2 : \rho' \Downarrow v'_2, G'_2$ where $T(G'_2, G_2)$ and $S(v'_2, v_2)$. Then
we have the premises to conclude $e'_1@e'_2 : \rho' \to_c e'_0 : \rho'_0[x \mapsto v'_2],\ G_1'^{-\epsilon/\lambda x.e'_0} \cup_{e'_0} G_2'^{\epsilon \mapsto x}$. By
definition of S we have $S(e'_0 : \rho'_0[x \mapsto v'_2], e_0 : \rho_0[x \mapsto v_2])$. We notice that $x \notin fv(\lambda x.e'_0)$
and therefore $(p \to x) \notin G'_1$.

We consider different possibilities for the generated graphs:

If $x \in fv(e'_0)$ but $x \notin fv(e_0)$ then we can have some extra edges going to x in extended
semantics where we will have no edges to x in pure semantics because x is not in the target,
but this is acceptable in the T-relation. There can also be some extra edges going to $\epsilon$ in
pure semantics where no edges go to $\epsilon$ in exact semantics, but as $\epsilon$ is within the codomain
in pure semantics, this is also acceptable in the T-relation. Since $T(G'_1, G_1)$ it will still hold
that $T(G_1'^{-\epsilon/\lambda x.e'_0} \cup_{e'_0} G_2'^{\epsilon \mapsto x},\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x})$.

If $x \in fv(e_0)$ then also $x \in fv(e'_0)$, and if $x \notin fv(e'_0)$ then $x \notin fv(e_0)$; in these cases, since
$T(G'_1, G_1)$ and $T(G'_2, G_2)$, also $T(G_1'^{-\epsilon/\lambda x.e'_0} \cup_{e'_0} G_2'^{\epsilon \mapsto x},\ G_1^{-\epsilon/\lambda x.e_0} \cup_{e_0} G_2^{\epsilon \mapsto x})$.

Case $S(A : \rho', e_1@e_2 : \rho)$ with $e_1@e_2 : \rho \to_d e_2 : \rho,\ id^{\downarrow}_{e_2}$ by (Operand)-rule. By the
definition of S we must have A =>p e_1@e_2. This again by Lemma 8.7 gives that we must have
A ::= e'_1@e'_2. Then $A : \rho' \to e'_1@e'_2 : \rho',\ id^{=}_{e'_1@e'_2}$ by (Gram)-rule, and we have
$S(e'_1@e'_2 : \rho', e_1@e_2 : \rho)$. Then we have seen that $e'_1@e'_2 : \rho' \to_d e'_2 : \rho',\ id^{\downarrow}_{e'_2}$ with $S(e'_2 : \rho', e_2 : \rho)$ and
$T(id^{\downarrow}_{e'_2}, id^{\downarrow}_{e_2})$. We have that the edges of $id^{\downarrow}_{e'_2}$ are the same as the edges of $(id^{=}_{e'_1@e'_2}; id^{\downarrow}_{e'_2})$,
hence $T((id^{=}_{e'_1@e'_2}; id^{\downarrow}_{e'_2}),\ id^{\downarrow}_{e_2})$.

Case $S(A : \rho', e_1@e_2 : \rho)$ with (Call)-rule $e_1@e_2 : \rho \to_c e_0 : \rho_0[x \mapsto v_2], G$: Similarly as before
we have $A : \rho' \to e'_1@e'_2 : \rho',\ id^{=}_{e'_1@e'_2}$ by (Gram)-rule, and we have $S(e'_1@e'_2 : \rho', e_1@e_2 : \rho)$.
We can now use the derivation above, and with the notation from above we have
$e'_1@e'_2 : \rho' \to_c e'_0 : \rho'_0[x \mapsto v'_2], G'$ with $S(e'_0 : \rho'_0[x \mapsto v'_2], e_0 : \rho_0[x \mapsto v_2])$ and $T(G', G)$. Looking into
the derivation of G' we find that the edges of G' are the same as the edges of $(id^{=}_{e'_1@e'_2}; G')$.

Case $S(e' : \rho', e : \rho)$, $e : \rho \Downarrow v, G$ by (Result)-rule, where we have the premises
$e : \rho \to e_s : \rho_s, G_s$ and $e_s : \rho_s \Downarrow v, G_v$, $G = G_s; G_v$: By IH since $e : \rho \to e_s : \rho_s, G_s$ then
$e' : \rho' \to^j s : \rho' \to e'_s : \rho'_s, G'_s$ with $S(e'_s : \rho'_s, e_s : \rho_s)$ and $T(G'_s, G_s)$, $j \in \{0, 1\}$. Again by
IH since $e_s : \rho_s \Downarrow v, G_v$ then $e'_s : \rho'_s \Downarrow v', G'_v$ with $S(v, v')$ and $T(G'_v, G_v)$. Let $G' = G'_s; G'_v$;
then $T(G', G)$. If j = 0 we have the premises to conclude $e' : \rho' \Downarrow v', G'$. If j = 1, by
Lemma 5.17 we have $S(s : \rho', e : \rho)$ and we have the premises to conclude $s : \rho' \Downarrow v', G'$, and
by application of the (Result)-rule once more in the extended semantics we can also conclude
$e' : \rho' \Downarrow v', id^{=}_{s}; G'$ where the edge set of G' is the same as the edge set of $id^{=}_{s}; G'$. □